{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# `vaex` @ PyData Budapest 2020\n",
    "\n",
    "## New York Taxi Dataset (2009-2015): Exploratory Data Analysis\n",
    "\n",
    "https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page\n",
    "\n",
    "\n",
    "Running this notebooks requires `vaex==3.0.0`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:00:13.441260Z",
     "start_time": "2020-06-10T17:00:12.512651Z"
    }
   },
   "outputs": [],
   "source": [
    "import vaex\n",
    "from vaex.ui.colormaps import cm_plusmin\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "import pylab as plt\n",
    "import seaborn as sns\n",
    "\n",
    "import pandas as pd\n",
    "pd.options.display.max_rows = 70\n",
    "\n",
    "import warnings; warnings.simplefilter('ignore')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Main concepts behind `vaex`:\n",
    " - Memory mapping\n",
    " - Lazy evaluations\n",
    " - Expression system (\"virtual\" columns)\n",
    " - High-performance algorithms"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Memory mapping"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:00:26.643586Z",
     "start_time": "2020-06-10T17:00:26.520339Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "164G\t/data/yellow_taxi_2009_2015.hdf5\r\n",
      "108G\t/data/yellow_taxi_2009_2015_f32.hdf5\r\n",
      "12G\t/data/yellow_taxi_2015_f32.hdf5\r\n"
     ]
    }
   ],
   "source": [
    "!du -h /data/yellow*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Get instant access to your data!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:00:56.383625Z",
     "start_time": "2020-06-10T17:00:56.373900Z"
    }
   },
   "outputs": [],
   "source": [
    "df = vaex.open('/data/yellow_taxi_2009_2015_f32.hdf5')"
   ]
  },
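  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Opening the file returns almost immediately because the data is memory-mapped rather than read into RAM. The sketch below shows the same idea with `numpy.memmap` on a small temporary file. It is only an illustration of memory mapping, not how vaex is implemented, and the file name is made up for the example.\n",
    "\n",
    "```python\n",
    "import os\n",
    "import tempfile\n",
    "\n",
    "import numpy as np\n",
    "\n",
    "# Write a small float32 array to disk (a stand-in for a large column on disk).\n",
    "path = os.path.join(tempfile.mkdtemp(), 'column_f32.bin')  # hypothetical file name\n",
    "np.arange(1_000_000, dtype=np.float32).tofile(path)\n",
    "\n",
    "# Memory-map the file: the length is inferred from the file size,\n",
    "# and no values are read until they are accessed.\n",
    "column = np.memmap(path, dtype=np.float32, mode='r')\n",
    "\n",
    "# Copy a small slice into a regular in-memory array.\n",
    "first_rows = np.array(column[:5])\n",
    "n_rows = column.shape[0]\n",
    "print(n_rows, first_rows)\n",
    "```\n",
    "\n",
    "Only the slice that is actually accessed has to be brought into memory, which is why the size of the file on disk does not matter when opening it."
   ]
  },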
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can also stream it directly from S3:\n",
    "```\n",
    "df = vaex.open('s3://vaex/taxi/yellow_taxi_2015_f32s.hdf5?anon=true')\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Lazy evaluations\n",
    "\n",
    "Just get a quick preview whenever you want to \"peak\" at your data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:02:00.211568Z",
     "start_time": "2020-06-10T17:02:00.189830Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead>\n",
       "<tr><th>#                                        </th><th>vendor_id  </th><th>pickup_datetime              </th><th>dropoff_datetime             </th><th>passenger_count  </th><th>payment_type  </th><th>trip_distance     </th><th>pickup_longitude  </th><th>pickup_latitude   </th><th>rate_code  </th><th>store_and_fwd_flag  </th><th>dropoff_longitude  </th><th>dropoff_latitude  </th><th>fare_amount       </th><th>surcharge  </th><th>mta_tax  </th><th>tip_amount        </th><th>tolls_amount  </th><th>total_amount      </th></tr>\n",
       "</thead>\n",
       "<tbody>\n",
       "<tr><td><i style='opacity: 0.6'>0</i>            </td><td>VTS        </td><td>2009-01-04 02:52:00.000000000</td><td>2009-01-04 03:02:00.000000000</td><td>1                </td><td>CASH          </td><td>2.630000114440918 </td><td>-73.99195861816406</td><td>40.72156524658203 </td><td>nan        </td><td>nan                 </td><td>-73.99380493164062 </td><td>40.6959228515625  </td><td>8.899999618530273 </td><td>0.5        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>9.399999618530273 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1</i>            </td><td>VTS        </td><td>2009-01-04 03:31:00.000000000</td><td>2009-01-04 03:38:00.000000000</td><td>3                </td><td>Credit        </td><td>4.550000190734863 </td><td>-73.98210144042969</td><td>40.736289978027344</td><td>nan        </td><td>nan                 </td><td>-73.95584869384766 </td><td>40.768028259277344</td><td>12.100000381469727</td><td>0.5        </td><td>nan      </td><td>2.0               </td><td>0.0           </td><td>14.600000381469727</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>2</i>            </td><td>VTS        </td><td>2009-01-03 15:43:00.000000000</td><td>2009-01-03 15:57:00.000000000</td><td>5                </td><td>Credit        </td><td>10.350000381469727</td><td>-74.0025863647461 </td><td>40.73974609375    </td><td>nan        </td><td>nan                 </td><td>-73.86997985839844 </td><td>40.770225524902344</td><td>23.700000762939453</td><td>0.0        </td><td>nan      </td><td>4.739999771118164 </td><td>0.0           </td><td>28.440000534057617</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>3</i>            </td><td>DDS        </td><td>2009-01-01 20:52:58.000000000</td><td>2009-01-01 21:14:00.000000000</td><td>1                </td><td>CREDIT        </td><td>5.0               </td><td>-73.9742660522461 </td><td>40.79095458984375 </td><td>nan        </td><td>nan                 </td><td>-73.9965591430664  </td><td>40.731849670410156</td><td>14.899999618530273</td><td>0.5        </td><td>nan      </td><td>3.049999952316284 </td><td>0.0           </td><td>18.450000762939453</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>4</i>            </td><td>DDS        </td><td>2009-01-24 16:18:23.000000000</td><td>2009-01-24 16:24:56.000000000</td><td>1                </td><td>CASH          </td><td>0.4000000059604645</td><td>-74.00157928466797</td><td>40.719383239746094</td><td>nan        </td><td>nan                 </td><td>-74.00837707519531 </td><td>40.7203483581543  </td><td>3.700000047683716 </td><td>0.0        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>3.700000047683716 </td></tr>\n",
       "<tr><td>...                                      </td><td>...        </td><td>...                          </td><td>...                          </td><td>...              </td><td>...           </td><td>...               </td><td>...               </td><td>...               </td><td>...        </td><td>...                 </td><td>...                </td><td>...               </td><td>...               </td><td>...        </td><td>...      </td><td>...               </td><td>...           </td><td>...               </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,922</i></td><td>VTS        </td><td>2015-12-31 23:59:56.000000000</td><td>2016-01-01 00:08:18.000000000</td><td>5                </td><td>1             </td><td>1.2000000476837158</td><td>-73.99381256103516</td><td>40.72087097167969 </td><td>1.0        </td><td>0.0                 </td><td>-73.98621368408203 </td><td>40.722469329833984</td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>1.7599999904632568</td><td>0.0           </td><td>10.5600004196167  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,923</i></td><td>CMT        </td><td>2015-12-31 23:59:58.000000000</td><td>2016-01-01 00:05:19.000000000</td><td>2                </td><td>2             </td><td>2.0               </td><td>-73.96527099609375</td><td>40.76028060913086 </td><td>1.0        </td><td>0.0                 </td><td>-73.93951416015625 </td><td>40.75238800048828 </td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>8.800000190734863 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,924</i></td><td>CMT        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:12:55.000000000</td><td>2                </td><td>2             </td><td>3.799999952316284 </td><td>-73.98729705810547</td><td>40.739078521728516</td><td>1.0        </td><td>0.0                 </td><td>-73.9886703491211  </td><td>40.69329833984375 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>14.800000190734863</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,925</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:10:26.000000000</td><td>1                </td><td>2             </td><td>1.9600000381469727</td><td>-73.99755859375   </td><td>40.72569274902344 </td><td>1.0        </td><td>0.0                 </td><td>-74.01712036132812 </td><td>40.705322265625   </td><td>8.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>9.800000190734863 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,926</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:21:30.000000000</td><td>1                </td><td>1             </td><td>1.059999942779541 </td><td>-73.9843978881836 </td><td>40.76725769042969 </td><td>1.0        </td><td>0.0                 </td><td>-73.99098205566406 </td><td>40.76057052612305 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>2.9600000381469727</td><td>0.0           </td><td>17.760000228881836</td></tr>\n",
       "</tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "#              vendor_id    pickup_datetime                dropoff_datetime               passenger_count    payment_type    trip_distance       pickup_longitude    pickup_latitude     rate_code    store_and_fwd_flag    dropoff_longitude    dropoff_latitude    fare_amount         surcharge    mta_tax    tip_amount          tolls_amount    total_amount\n",
       "0              VTS          2009-01-04 02:52:00.000000000  2009-01-04 03:02:00.000000000  1                  CASH            2.630000114440918   -73.99195861816406  40.72156524658203   nan          nan                   -73.99380493164062   40.6959228515625    8.899999618530273   0.5          nan        0.0                 0.0             9.399999618530273\n",
       "1              VTS          2009-01-04 03:31:00.000000000  2009-01-04 03:38:00.000000000  3                  Credit          4.550000190734863   -73.98210144042969  40.736289978027344  nan          nan                   -73.95584869384766   40.768028259277344  12.100000381469727  0.5          nan        2.0                 0.0             14.600000381469727\n",
       "2              VTS          2009-01-03 15:43:00.000000000  2009-01-03 15:57:00.000000000  5                  Credit          10.350000381469727  -74.0025863647461   40.73974609375      nan          nan                   -73.86997985839844   40.770225524902344  23.700000762939453  0.0          nan        4.739999771118164   0.0             28.440000534057617\n",
       "3              DDS          2009-01-01 20:52:58.000000000  2009-01-01 21:14:00.000000000  1                  CREDIT          5.0                 -73.9742660522461   40.79095458984375   nan          nan                   -73.9965591430664    40.731849670410156  14.899999618530273  0.5          nan        3.049999952316284   0.0             18.450000762939453\n",
       "4              DDS          2009-01-24 16:18:23.000000000  2009-01-24 16:24:56.000000000  1                  CASH            0.4000000059604645  -74.00157928466797  40.719383239746094  nan          nan                   -74.00837707519531   40.7203483581543    3.700000047683716   0.0          nan        0.0                 0.0             3.700000047683716\n",
       "...            ...          ...                            ...                            ...                ...             ...                 ...                 ...                 ...          ...                   ...                  ...                 ...                 ...          ...        ...                 ...             ...\n",
       "1,173,057,922  VTS          2015-12-31 23:59:56.000000000  2016-01-01 00:08:18.000000000  5                  1               1.2000000476837158  -73.99381256103516  40.72087097167969   1.0          0.0                   -73.98621368408203   40.722469329833984  7.5                 0.5          0.5        1.7599999904632568  0.0             10.5600004196167\n",
       "1,173,057,923  CMT          2015-12-31 23:59:58.000000000  2016-01-01 00:05:19.000000000  2                  2               2.0                 -73.96527099609375  40.76028060913086   1.0          0.0                   -73.93951416015625   40.75238800048828   7.5                 0.5          0.5        0.0                 0.0             8.800000190734863\n",
       "1,173,057,924  CMT          2015-12-31 23:59:59.000000000  2016-01-01 00:12:55.000000000  2                  2               3.799999952316284   -73.98729705810547  40.739078521728516  1.0          0.0                   -73.9886703491211    40.69329833984375   13.5                0.5          0.5        0.0                 0.0             14.800000190734863\n",
       "1,173,057,925  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:10:26.000000000  1                  2               1.9600000381469727  -73.99755859375     40.72569274902344   1.0          0.0                   -74.01712036132812   40.705322265625     8.5                 0.5          0.5        0.0                 0.0             9.800000190734863\n",
       "1,173,057,926  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:21:30.000000000  1                  1               1.059999942779541   -73.9843978881836   40.76725769042969   1.0          0.0                   -73.99098205566406   40.76057052612305   13.5                0.5          0.5        2.9600000381469727  0.0             17.760000228881836"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Expression system (\"virtual\" columns)\n",
    "\n",
    "We call a single \"column\" an \"expression\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:02:16.002450Z",
     "start_time": "2020-06-10T17:02:15.992533Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Expression = tip_amount\n",
       "Length: 1,173,057,927 dtype: float32 (column)\n",
       "---------------------------------------------\n",
       "         0     0\n",
       "         1     2\n",
       "         2  4.74\n",
       "         3  3.05\n",
       "         4     0\n",
       "      ...       \n",
       "1173057922  1.76\n",
       "1173057923     0\n",
       "1173057924     0\n",
       "1173057925     0\n",
       "1173057926  2.96"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.tip_amount"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-03T08:13:57.992614Z",
     "start_time": "2020-06-03T08:13:57.989452Z"
    }
   },
   "source": [
    "Defining new columns takes no memory"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:02:39.420566Z",
     "start_time": "2020-06-10T17:02:39.409580Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Expression = tip_amount\n",
       "Length: 1,173,057,927 dtype: float32 (column)\n",
       "---------------------------------------------\n",
       "         0     0\n",
       "         1     2\n",
       "         2  4.74\n",
       "         3  3.05\n",
       "         4     0\n",
       "      ...       \n",
       "1173057922  1.76\n",
       "1173057923     0\n",
       "1173057924     0\n",
       "1173057925     0\n",
       "1173057926  2.96"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df['tip_percentage'] = df.tip_amount / df.total_amount\n",
    "df.tip_amount"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Peeking at the data is instant"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:03:09.034021Z",
     "start_time": "2020-06-10T17:03:09.009857Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead>\n",
       "<tr><th>#                                        </th><th>vendor_id  </th><th>pickup_datetime              </th><th>dropoff_datetime             </th><th>passenger_count  </th><th>payment_type  </th><th>trip_distance     </th><th>pickup_longitude  </th><th>pickup_latitude   </th><th>rate_code  </th><th>store_and_fwd_flag  </th><th>dropoff_longitude  </th><th>dropoff_latitude  </th><th>fare_amount       </th><th>surcharge  </th><th>mta_tax  </th><th>tip_amount        </th><th>tolls_amount  </th><th>total_amount      </th><th>tip_percentage     </th></tr>\n",
       "</thead>\n",
       "<tbody>\n",
       "<tr><td><i style='opacity: 0.6'>0</i>            </td><td>VTS        </td><td>2009-01-04 02:52:00.000000000</td><td>2009-01-04 03:02:00.000000000</td><td>1                </td><td>CASH          </td><td>2.630000114440918 </td><td>-73.99195861816406</td><td>40.72156524658203 </td><td>nan        </td><td>nan                 </td><td>-73.99380493164062 </td><td>40.6959228515625  </td><td>8.899999618530273 </td><td>0.5        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>9.399999618530273 </td><td>0.0                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1</i>            </td><td>VTS        </td><td>2009-01-04 03:31:00.000000000</td><td>2009-01-04 03:38:00.000000000</td><td>3                </td><td>Credit        </td><td>4.550000190734863 </td><td>-73.98210144042969</td><td>40.736289978027344</td><td>nan        </td><td>nan                 </td><td>-73.95584869384766 </td><td>40.768028259277344</td><td>12.100000381469727</td><td>0.5        </td><td>nan      </td><td>2.0               </td><td>0.0           </td><td>14.600000381469727</td><td>0.13698630034923553</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>2</i>            </td><td>VTS        </td><td>2009-01-03 15:43:00.000000000</td><td>2009-01-03 15:57:00.000000000</td><td>5                </td><td>Credit        </td><td>10.350000381469727</td><td>-74.0025863647461 </td><td>40.73974609375    </td><td>nan        </td><td>nan                 </td><td>-73.86997985839844 </td><td>40.770225524902344</td><td>23.700000762939453</td><td>0.0        </td><td>nan      </td><td>4.739999771118164 </td><td>0.0           </td><td>28.440000534057617</td><td>0.1666666567325592 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>3</i>            </td><td>DDS        </td><td>2009-01-01 20:52:58.000000000</td><td>2009-01-01 21:14:00.000000000</td><td>1                </td><td>CREDIT        </td><td>5.0               </td><td>-73.9742660522461 </td><td>40.79095458984375 </td><td>nan        </td><td>nan                 </td><td>-73.9965591430664  </td><td>40.731849670410156</td><td>14.899999618530273</td><td>0.5        </td><td>nan      </td><td>3.049999952316284 </td><td>0.0           </td><td>18.450000762939453</td><td>0.16531164944171906</td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>4</i>            </td><td>DDS        </td><td>2009-01-24 16:18:23.000000000</td><td>2009-01-24 16:24:56.000000000</td><td>1                </td><td>CASH          </td><td>0.4000000059604645</td><td>-74.00157928466797</td><td>40.719383239746094</td><td>nan        </td><td>nan                 </td><td>-74.00837707519531 </td><td>40.7203483581543  </td><td>3.700000047683716 </td><td>0.0        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>3.700000047683716 </td><td>0.0                </td></tr>\n",
       "<tr><td>...                                      </td><td>...        </td><td>...                          </td><td>...                          </td><td>...              </td><td>...           </td><td>...               </td><td>...               </td><td>...               </td><td>...        </td><td>...                 </td><td>...                </td><td>...               </td><td>...               </td><td>...        </td><td>...      </td><td>...               </td><td>...           </td><td>...               </td><td>...                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,922</i></td><td>VTS        </td><td>2015-12-31 23:59:56.000000000</td><td>2016-01-01 00:08:18.000000000</td><td>5                </td><td>1             </td><td>1.2000000476837158</td><td>-73.99381256103516</td><td>40.72087097167969 </td><td>1.0        </td><td>0.0                 </td><td>-73.98621368408203 </td><td>40.722469329833984</td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>1.7599999904632568</td><td>0.0           </td><td>10.5600004196167  </td><td>0.1666666567325592 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,923</i></td><td>CMT        </td><td>2015-12-31 23:59:58.000000000</td><td>2016-01-01 00:05:19.000000000</td><td>2                </td><td>2             </td><td>2.0               </td><td>-73.96527099609375</td><td>40.76028060913086 </td><td>1.0        </td><td>0.0                 </td><td>-73.93951416015625 </td><td>40.75238800048828 </td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>8.800000190734863 </td><td>0.0                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,924</i></td><td>CMT        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:12:55.000000000</td><td>2                </td><td>2             </td><td>3.799999952316284 </td><td>-73.98729705810547</td><td>40.739078521728516</td><td>1.0        </td><td>0.0                 </td><td>-73.9886703491211  </td><td>40.69329833984375 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>14.800000190734863</td><td>0.0                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,925</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:10:26.000000000</td><td>1                </td><td>2             </td><td>1.9600000381469727</td><td>-73.99755859375   </td><td>40.72569274902344 </td><td>1.0        </td><td>0.0                 </td><td>-74.01712036132812 </td><td>40.705322265625   </td><td>8.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>9.800000190734863 </td><td>0.0                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,173,057,926</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:21:30.000000000</td><td>1                </td><td>1             </td><td>1.059999942779541 </td><td>-73.9843978881836 </td><td>40.76725769042969 </td><td>1.0        </td><td>0.0                 </td><td>-73.99098205566406 </td><td>40.76057052612305 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>2.9600000381469727</td><td>0.0           </td><td>17.760000228881836</td><td>0.1666666716337204 </td></tr>\n",
       "</tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "#              vendor_id    pickup_datetime                dropoff_datetime               passenger_count    payment_type    trip_distance       pickup_longitude    pickup_latitude     rate_code    store_and_fwd_flag    dropoff_longitude    dropoff_latitude    fare_amount         surcharge    mta_tax    tip_amount          tolls_amount    total_amount        tip_percentage\n",
       "0              VTS          2009-01-04 02:52:00.000000000  2009-01-04 03:02:00.000000000  1                  CASH            2.630000114440918   -73.99195861816406  40.72156524658203   nan          nan                   -73.99380493164062   40.6959228515625    8.899999618530273   0.5          nan        0.0                 0.0             9.399999618530273   0.0\n",
       "1              VTS          2009-01-04 03:31:00.000000000  2009-01-04 03:38:00.000000000  3                  Credit          4.550000190734863   -73.98210144042969  40.736289978027344  nan          nan                   -73.95584869384766   40.768028259277344  12.100000381469727  0.5          nan        2.0                 0.0             14.600000381469727  0.13698630034923553\n",
       "2              VTS          2009-01-03 15:43:00.000000000  2009-01-03 15:57:00.000000000  5                  Credit          10.350000381469727  -74.0025863647461   40.73974609375      nan          nan                   -73.86997985839844   40.770225524902344  23.700000762939453  0.0          nan        4.739999771118164   0.0             28.440000534057617  0.1666666567325592\n",
       "3              DDS          2009-01-01 20:52:58.000000000  2009-01-01 21:14:00.000000000  1                  CREDIT          5.0                 -73.9742660522461   40.79095458984375   nan          nan                   -73.9965591430664    40.731849670410156  14.899999618530273  0.5          nan        3.049999952316284   0.0             18.450000762939453  0.16531164944171906\n",
       "4              DDS          2009-01-24 16:18:23.000000000  2009-01-24 16:24:56.000000000  1                  CASH            0.4000000059604645  -74.00157928466797  40.719383239746094  nan          nan                   -74.00837707519531   40.7203483581543    3.700000047683716   0.0          nan        0.0                 0.0             3.700000047683716   0.0\n",
       "...            ...          ...                            ...                            ...                ...             ...                 ...                 ...                 ...          ...                   ...                  ...                 ...                 ...          ...        ...                 ...             ...                 ...\n",
       "1,173,057,922  VTS          2015-12-31 23:59:56.000000000  2016-01-01 00:08:18.000000000  5                  1               1.2000000476837158  -73.99381256103516  40.72087097167969   1.0          0.0                   -73.98621368408203   40.722469329833984  7.5                 0.5          0.5        1.7599999904632568  0.0             10.5600004196167    0.1666666567325592\n",
       "1,173,057,923  CMT          2015-12-31 23:59:58.000000000  2016-01-01 00:05:19.000000000  2                  2               2.0                 -73.96527099609375  40.76028060913086   1.0          0.0                   -73.93951416015625   40.75238800048828   7.5                 0.5          0.5        0.0                 0.0             8.800000190734863   0.0\n",
       "1,173,057,924  CMT          2015-12-31 23:59:59.000000000  2016-01-01 00:12:55.000000000  2                  2               3.799999952316284   -73.98729705810547  40.739078521728516  1.0          0.0                   -73.9886703491211    40.69329833984375   13.5                0.5          0.5        0.0                 0.0             14.800000190734863  0.0\n",
       "1,173,057,925  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:10:26.000000000  1                  2               1.9600000381469727  -73.99755859375     40.72569274902344   1.0          0.0                   -74.01712036132812   40.705322265625     8.5                 0.5          0.5        0.0                 0.0             9.800000190734863   0.0\n",
       "1,173,057,926  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:21:30.000000000  1                  1               1.059999942779541   -73.9843978881836   40.76725769042969   1.0          0.0                   -73.99098205566406   40.76057052612305   13.5                0.5          0.5        2.9600000381469727  0.0             17.760000228881836  0.1666666716337204"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Vaex knows when to be lazy, and when to be eager:\n",
    " - If the output of an operation is a new column, vaex will be lazy\n",
    " - If the output of an operation is expected to be a new data strugture (single number, list etc..), vaex will be eager"
   ]
  },
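  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The lazy/eager split can be illustrated with a toy class. This is a minimal sketch in plain NumPy, not vaex's implementation: `LazyColumn`, its `compute` callback and `chunk_size` are hypothetical names invented for this example.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "\n",
    "class LazyColumn:\n",
    "    # Toy stand-in for a virtual column: it stores a recipe, not the values.\n",
    "\n",
    "    def __init__(self, compute, length):\n",
    "        self.compute = compute  # function (start, stop) -> numpy array\n",
    "        self.length = length\n",
    "\n",
    "    def mean(self, chunk_size=4):\n",
    "        # Eager step: evaluate chunk by chunk and keep only a running total.\n",
    "        total = 0.0\n",
    "        for start in range(0, self.length, chunk_size):\n",
    "            stop = min(start + chunk_size, self.length)\n",
    "            total += float(self.compute(start, stop).sum())\n",
    "        return total / self.length\n",
    "\n",
    "\n",
    "tip = np.array([0.0, 2.0, 4.0, 3.0, 1.0])\n",
    "total = np.array([10.0, 10.0, 20.0, 15.0, 10.0])\n",
    "\n",
    "# Lazy: nothing is computed here, only the expression is recorded.\n",
    "tip_fraction = LazyColumn(lambda a, b: tip[a:b] / total[a:b], len(tip))\n",
    "\n",
    "# Eager: a single number is requested, so the data is actually processed.\n",
    "result = tip_fraction.mean(chunk_size=2)\n",
    "print(result)\n",
    "```\n",
    "\n",
    "Defining `tip_fraction` costs nothing regardless of the column length. Work happens only when `mean()` is called, and even then only one chunk is held in memory at a time."
   ]
  },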
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:03:54.043968Z",
     "start_time": "2020-06-10T17:03:52.286886Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(inf)"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.tip_percentage.mean()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Filtering creates a shallow copy of the DataFrame. The data itself is not copied!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:04:21.114765Z",
     "start_time": "2020-06-10T17:04:20.683370Z"
    }
   },
   "outputs": [],
   "source": [
    "df_filtered = df[df.total_amount>0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:04:25.694355Z",
     "start_time": "2020-06-10T17:04:23.127770Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(0.07121788)"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df_filtered.tip_percentage.mean()"
   ]
  },
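  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The unfiltered mean is `inf`, while the mean over rows with `total_amount > 0` is finite. This is what one would expect if at least one trip has a positive `tip_amount` and a `total_amount` of zero, although that is an assumption about the data and is not checked in this notebook. A small NumPy sketch with made-up values reproduces the effect:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Made-up values; the second row has a tip but a zero total.\n",
    "tip_amount = np.array([2.0, 1.0, 0.0, 3.0], dtype=np.float32)\n",
    "total_amount = np.array([10.0, 0.0, 5.0, 15.0], dtype=np.float32)\n",
    "\n",
    "# A positive value divided by zero gives inf (the warning is silenced here).\n",
    "with np.errstate(divide='ignore'):\n",
    "    tip_percentage = tip_amount / total_amount\n",
    "\n",
    "# A single inf is enough to make the overall mean inf.\n",
    "mean_all = tip_percentage.mean()\n",
    "\n",
    "# Restricting to rows with a positive total gives a finite mean.\n",
    "valid = total_amount > 0\n",
    "mean_valid = tip_percentage[valid].mean()\n",
    "print(mean_all, mean_valid)\n",
    "```"
   ]
  },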
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### High performance, efficient algorithms"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:04:40.772052Z",
     "start_time": "2020-06-10T17:04:40.768923Z"
    }
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Number of rows: 1,173,057,927\n",
      "Number of columns: 19\n"
     ]
    }
   ],
   "source": [
    "# Check length of file\n",
    "rows, columns = df.shape\n",
    "print(f'Number of rows: {rows:,}')\n",
    "print(f'Number of columns: {columns}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:05:30.735758Z",
     "start_time": "2020-06-10T17:04:53.283836Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>vendor_id</th>\n",
       "      <th>pickup_datetime</th>\n",
       "      <th>dropoff_datetime</th>\n",
       "      <th>passenger_count</th>\n",
       "      <th>payment_type</th>\n",
       "      <th>trip_distance</th>\n",
       "      <th>pickup_longitude</th>\n",
       "      <th>pickup_latitude</th>\n",
       "      <th>rate_code</th>\n",
       "      <th>store_and_fwd_flag</th>\n",
       "      <th>dropoff_longitude</th>\n",
       "      <th>dropoff_latitude</th>\n",
       "      <th>fare_amount</th>\n",
       "      <th>surcharge</th>\n",
       "      <th>mta_tax</th>\n",
       "      <th>tip_amount</th>\n",
       "      <th>tolls_amount</th>\n",
       "      <th>total_amount</th>\n",
       "      <th>tip_percentage</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>dtype</th>\n",
       "      <td>str</td>\n",
       "      <td>datetime64[ns]</td>\n",
       "      <td>datetime64[ns]</td>\n",
       "      <td>int64</td>\n",
       "      <td>str</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "      <td>float32</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>1173057927</td>\n",
       "      <td>1173057927</td>\n",
       "      <td>1173057927</td>\n",
       "      <td>1173057927</td>\n",
       "      <td>1173057927</td>\n",
       "      <td>1173057927</td>\n",
       "      <td>1173057927</td>\n",
       "      <td>1173057926</td>\n",
       "      <td>1002161871</td>\n",
       "      <td>638914438</td>\n",
       "      <td>1173043432</td>\n",
       "      <td>1173050240</td>\n",
       "      <td>1173057925</td>\n",
       "      <td>1173057925</td>\n",
       "      <td>1032017356</td>\n",
       "      <td>1173057925</td>\n",
       "      <td>1173057925</td>\n",
       "      <td>1173057925</td>\n",
       "      <td>1173034681</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>NA</th>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>1</td>\n",
       "      <td>170896056</td>\n",
       "      <td>534143489</td>\n",
       "      <td>14495</td>\n",
       "      <td>7687</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>141040571</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>23246</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>mean</th>\n",
       "      <td>--</td>\n",
       "      <td>1970-01-01T00:00:01.953533625</td>\n",
       "      <td>1970-01-01T00:00:14.506598422</td>\n",
       "      <td>1.6844313554517245</td>\n",
       "      <td>--</td>\n",
       "      <td>5.390923660999704</td>\n",
       "      <td>-72.53224844702991</td>\n",
       "      <td>39.9345313935188</td>\n",
       "      <td>1.035820754150404</td>\n",
       "      <td>0.017168377090266976</td>\n",
       "      <td>-72.53741806425096</td>\n",
       "      <td>39.93694872311038</td>\n",
       "      <td>11.217308155801057</td>\n",
       "      <td>0.3036385232379655</td>\n",
       "      <td>0.4963069205116383</td>\n",
       "      <td>1.1294571893027419</td>\n",
       "      <td>0.1867806751775823</td>\n",
       "      <td>13.314765814201305</td>\n",
       "      <td>inf</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>std</th>\n",
       "      <td>--</td>\n",
       "      <td>6.22239e+16</td>\n",
       "      <td>6.22266e+16</td>\n",
       "      <td>1.33032</td>\n",
       "      <td>--</td>\n",
       "      <td>7756.52</td>\n",
       "      <td>12.7505</td>\n",
       "      <td>9.51675</td>\n",
       "      <td>0.441996</td>\n",
       "      <td>0.129899</td>\n",
       "      <td>12.6768</td>\n",
       "      <td>9.50487</td>\n",
       "      <td>633.505</td>\n",
       "      <td>0.395407</td>\n",
       "      <td>0.0683994</td>\n",
       "      <td>132.842</td>\n",
       "      <td>886.718</td>\n",
       "      <td>1098.43</td>\n",
       "      <td>NaN</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>min</th>\n",
       "      <td>--</td>\n",
       "      <td>2009-01-01T00:00:27.365015552</td>\n",
       "      <td>1899-12-31T23:59:43.370698752</td>\n",
       "      <td>0</td>\n",
       "      <td>--</td>\n",
       "      <td>-4.08401e+07</td>\n",
       "      <td>-3509.02</td>\n",
       "      <td>-3579.14</td>\n",
       "      <td>0</td>\n",
       "      <td>0</td>\n",
       "      <td>-3579.14</td>\n",
       "      <td>-3579.14</td>\n",
       "      <td>-2.14748e+07</td>\n",
       "      <td>-79</td>\n",
       "      <td>-3</td>\n",
       "      <td>-1.67772e+06</td>\n",
       "      <td>-2.14748e+07</td>\n",
       "      <td>-2.14748e+07</td>\n",
       "      <td>-0.0877193</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>max</th>\n",
       "      <td>--</td>\n",
       "      <td>2016-01-01T00:00:49.632313344</td>\n",
       "      <td>2253-08-23T08:00:13.061652480</td>\n",
       "      <td>255</td>\n",
       "      <td>--</td>\n",
       "      <td>1.98623e+08</td>\n",
       "      <td>3570.22</td>\n",
       "      <td>3577.14</td>\n",
       "      <td>252</td>\n",
       "      <td>2</td>\n",
       "      <td>3460.43</td>\n",
       "      <td>3577.14</td>\n",
       "      <td>825999</td>\n",
       "      <td>999.99</td>\n",
       "      <td>1311.22</td>\n",
       "      <td>3.95059e+06</td>\n",
       "      <td>5510.07</td>\n",
       "      <td>3.95061e+06</td>\n",
       "      <td>inf</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        vendor_id                pickup_datetime  \\\n",
       "dtype         str                 datetime64[ns]   \n",
       "count  1173057927                     1173057927   \n",
       "NA              0                              0   \n",
       "mean           --  1970-01-01T00:00:01.953533625   \n",
       "std            --                    6.22239e+16   \n",
       "min            --  2009-01-01T00:00:27.365015552   \n",
       "max            --  2016-01-01T00:00:49.632313344   \n",
       "\n",
       "                    dropoff_datetime     passenger_count payment_type  \\\n",
       "dtype                 datetime64[ns]               int64          str   \n",
       "count                     1173057927          1173057927   1173057927   \n",
       "NA                                 0                   0            0   \n",
       "mean   1970-01-01T00:00:14.506598422  1.6844313554517245           --   \n",
       "std                      6.22266e+16             1.33032           --   \n",
       "min    1899-12-31T23:59:43.370698752                   0           --   \n",
       "max    2253-08-23T08:00:13.061652480                 255           --   \n",
       "\n",
       "           trip_distance    pickup_longitude   pickup_latitude  \\\n",
       "dtype            float32             float32           float32   \n",
       "count         1173057927          1173057927        1173057926   \n",
       "NA                     0                   0                 1   \n",
       "mean   5.390923660999704  -72.53224844702991  39.9345313935188   \n",
       "std              7756.52             12.7505           9.51675   \n",
       "min         -4.08401e+07            -3509.02          -3579.14   \n",
       "max          1.98623e+08             3570.22           3577.14   \n",
       "\n",
       "               rate_code    store_and_fwd_flag   dropoff_longitude  \\\n",
       "dtype            float32               float32             float32   \n",
       "count         1002161871             638914438          1173043432   \n",
       "NA             170896056             534143489               14495   \n",
       "mean   1.035820754150404  0.017168377090266976  -72.53741806425096   \n",
       "std             0.441996              0.129899             12.6768   \n",
       "min                    0                     0            -3579.14   \n",
       "max                  252                     2             3460.43   \n",
       "\n",
       "        dropoff_latitude         fare_amount           surcharge  \\\n",
       "dtype            float32             float32             float32   \n",
       "count         1173050240          1173057925          1173057925   \n",
       "NA                  7687                   2                   2   \n",
       "mean   39.93694872311038  11.217308155801057  0.3036385232379655   \n",
       "std              9.50487             633.505            0.395407   \n",
       "min             -3579.14        -2.14748e+07                 -79   \n",
       "max              3577.14              825999              999.99   \n",
       "\n",
       "                  mta_tax          tip_amount        tolls_amount  \\\n",
       "dtype             float32             float32             float32   \n",
       "count          1032017356          1173057925          1173057925   \n",
       "NA              141040571                   2                   2   \n",
       "mean   0.4963069205116383  1.1294571893027419  0.1867806751775823   \n",
       "std             0.0683994             132.842             886.718   \n",
       "min                    -3        -1.67772e+06        -2.14748e+07   \n",
       "max               1311.22         3.95059e+06             5510.07   \n",
       "\n",
       "             total_amount tip_percentage  \n",
       "dtype             float32        float32  \n",
       "count          1173057925     1173034681  \n",
       "NA                      2          23246  \n",
       "mean   13.314765814201305            inf  \n",
       "std               1098.43            NaN  \n",
       "min          -2.14748e+07     -0.0877193  \n",
       "max           3.95061e+06            inf  "
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.describe()"
   ]
  },
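  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `tip_percentage` column in the summary above has a mean of `inf` and a standard deviation of `NaN`. A plausible cause, assuming the column was defined earlier as a ratio of two amount columns, is division by zero. The sketch below reproduces the effect with plain NumPy on made-up values; it is only an illustration, not the computation `vaex` runs internally.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Made-up values for illustration only.\n",
    "tip = np.array([1.0, 2.0, 0.0, 3.0], dtype=np.float32)\n",
    "total = np.array([10.0, 0.0, 0.0, 20.0], dtype=np.float32)\n",
    "\n",
    "# x/0 gives inf and 0/0 gives NaN; silence the NumPy warnings for the demo.\n",
    "with np.errstate(divide='ignore', invalid='ignore'):\n",
    "    ratio = tip / total\n",
    "    mean_all = np.nanmean(ratio)  # NaN is skipped, inf is not\n",
    "    std_all = np.nanstd(ratio)    # inf - inf yields NaN\n",
    "\n",
    "print(ratio, mean_all, std_all)\n",
    "\n",
    "# Aggregating only where the denominator is non-zero gives finite results.\n",
    "valid = total != 0\n",
    "print(ratio[valid].mean())\n",
    "```\n",
    "\n",
    "One way to get finite statistics is therefore to aggregate only over rows with a non-zero denominator."
   ]
  },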
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Application: Exploring and cleaning the New York Taxi dataset\n",
    "\n",
    "### Remove missing data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:06:42.740441Z",
     "start_time": "2020-06-10T17:06:42.490207Z"
    }
   },
   "outputs": [],
   "source": [
    "# Drop rows with a missing value in any of these coordinate columns\n",
    "df = df.dropna(column_names=['dropoff_latitude', 'dropoff_longitude', 'pickup_latitude'])"
   ]
  },
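  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`dropna(column_names=[...])` removes every row that has a missing value in at least one of the listed columns. A rough NumPy analogue on made-up coordinates is sketched below. The arrays are illustrative, and this version builds the result eagerly, whereas `vaex` records the condition as a filter rather than copying the data.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Made-up coordinates; NaN marks a missing value.\n",
    "dropoff_latitude = np.array([40.7, np.nan, 40.8, 40.6])\n",
    "dropoff_longitude = np.array([-74.0, -73.9, np.nan, -73.8])\n",
    "pickup_latitude = np.array([40.7, 40.7, 40.7, 40.7])\n",
    "\n",
    "# A row is kept only if none of the selected columns is NaN.\n",
    "columns = np.column_stack([dropoff_latitude, dropoff_longitude, pickup_latitude])\n",
    "keep = ~np.isnan(columns).any(axis=1)\n",
    "\n",
    "print(keep)\n",
    "print(columns[keep])\n",
    "```"
   ]
  },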
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Abnormal number of passengers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:08:15.225487Z",
     "start_time": "2020-06-10T17:08:14.641536Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "d09a0473b2b84407b336f0d8da40447d",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "1      812315955\n",
       "2      172863547\n",
       "5       81923905\n",
       "3       51435661\n",
       "6       25614703\n",
       "4       24983364\n",
       "0        3903564\n",
       "208         1515\n",
       "7            435\n",
       "9            352\n",
       "8            313\n",
       "49            26\n",
       "10            17\n",
       "255           10\n",
       "129            7\n",
       "213            4\n",
       "250            3\n",
       "65             3\n",
       "15             2\n",
       "58             2\n",
       "33             2\n",
       "169            1\n",
       "37             1\n",
       "36             1\n",
       "34             1\n",
       "25             1\n",
       "19             1\n",
       "17             1\n",
       "193            1\n",
       "13             1\n",
       "47             1\n",
       "211            1\n",
       "223            1\n",
       "225            1\n",
       "229            1\n",
       "232            1\n",
       "247            1\n",
       "249            1\n",
       "38             1\n",
       "51             1\n",
       "177            1\n",
       "165            1\n",
       "164            1\n",
       "163            1\n",
       "160            1\n",
       "158            1\n",
       "155            1\n",
       "141            1\n",
       "137            1\n",
       "134            1\n",
       "133            1\n",
       "125            1\n",
       "113            1\n",
       "97             1\n",
       "91             1\n",
       "84             1\n",
       "254            1\n",
       "69             1\n",
       "66             1\n",
       "61             1\n",
       "53             1\n",
       "70             1\n",
       "dtype: int64"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df.passenger_count.value_counts(progress='widget')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:08:30.202981Z",
     "start_time": "2020-06-10T17:08:29.915240Z"
    }
   },
   "outputs": [],
   "source": [
    "# Keep only trips with a plausible passenger count (1 to 6)\n",
    "df = df[(df.passenger_count>0) & (df.passenger_count<7)]"
   ]
  },
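  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The filter above combines two element-wise comparisons with `&`. The parentheses matter, because `&` binds more tightly than `>` and `<` in Python. A small NumPy sketch on made-up passenger counts shows the same boolean logic:\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "# Made-up passenger counts, including implausible values.\n",
    "passenger_count = np.array([1, 0, 2, 6, 7, 208, 5])\n",
    "\n",
    "# Keep values strictly between 0 and 7, i.e. 1 to 6 passengers.\n",
    "mask = (passenger_count > 0) & (passenger_count < 7)\n",
    "\n",
    "print(passenger_count[mask])\n",
    "print(int(mask.sum()), 'of', mask.size, 'rows kept')\n",
    "```"
   ]
  },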
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Clean up distance values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:08:44.552308Z",
     "start_time": "2020-06-10T17:08:41.812832Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b3268bb3c8a84ae297bf80245cf40135",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjgAAAEYCAYAAABRMYxdAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAVCUlEQVR4nO3df5RndX3f8eeLH6mn6EboLrgubNdwSOJqmzXdAypNzhpNijRmExt/QNGNJV1s/BF7TFua9ERNT87hHKMeq0allYBWUCMhi0fqLw6KNoS46CowxMpBJcCEXSQ60B8mS9/943unnQ7znfnuzHzvd+Yzz8c5c+b7vffzvffN3LnDaz/3c+8nVYUkSVJLjpt0AZIkSavNgCNJkppjwJEkSc0x4EiSpOYYcCRJUnNOmHQBo9i8eXPt2LFj0mVIkqQ15rbbbnuoqrbMX74uAs6OHTs4ePDgpMuQJElrTJLvLLTcS1SSJKk5BhxJktQcA44kSWqOAUeSJDXHgCNJkppjwJEkSc0x4EiSpOYYcCRJUnPWxYP+NHlX33ovBw7dP3T93l3buPCc7T1WJEnScPbgaCQHDt3P1PTMguumpmcWDT+SJPXNHhyNbOfWTXz0kuc8bvnL3n/LBKqRJGk4e3AkSVJzDDiSJKk5BhxJktQcA44kSWrO2AJOkjOS3JTkriR3Jvn1bvkpST6b5Jvd95PHVYMkSdqYxtmDcxR4Y1U9HXg28JokO4FLgRur6izgxu69JEnSqhlbwKmq6ar6Svf6EeAuYBuwF7iqa3YV8IvjqkGSJG1MvYzBSbIDeBZwK3BaVU3DIAQBpw75zP4kB5McPHLkSB9lSpKkRow94CR5InAt8IaqWvhRuAuoqsurandV7d6yZcv4CpQkSc0Za8BJciKDcPPhqvqjbvGDSbZ267cCh8dZgyRJ2njGeRdVgA8Ad1XV2+esuh7Y173eBxwYVw2SJGljGudcVOcCrwBuT3KoW/abwGXAx5JcDNwLvGSMNUiSpA1obAGnqr4EZMjq549rv5IkST7JWJIkNceAI0mSmmPAkSRJzTHgSJKk5hhwJElScww4kiSpOQYcSZLUHAOOJElqjgFHkiQ1x4AjSZKaY8CRJEnNMeBIkqTmGHAkSVJzDDiSJKk5BhxJktQcA44kSWqOAUeSJDXHgCNJkppjwJEkSc0x4EiSpOYYcCRJUnMMOJIkqTkGHEmS1BwDjiRJao4BR5IkNceAI0mSmmPAkSRJzTHgSJKk5hhwJElScww4kiSpOQYcSZLUHAOOJElqjgFHkiQ1x4AjSZKaY8CRJEnNMeBIkqTmGHAkSVJzDDiSJKk5BhxJktQcA44kSWqOAUeSJDXHgCNJkpoztoCT5Iokh5PcMWfZm5Pcn+RQ93X+uPYvSZI2rnH24FwJnLfA8ndU1a7u64Yx7l+SJG1QYws4VXUz8PC4ti9JkjTMJMbgvDbJ17tLWCcPa5Rkf5KDSQ4eOXKkz/okSdI613fAeS9wJrALmAbeNqxhVV1eVburaveWLVv6qk+SJDWg14BTVQ9W1WNV9b+B/wic3ef+JUnSxtBrwEmydc7bXwLuGNZWkiRpuU4Y14aTXAPsATYnuQ94E7AnyS6ggG8Dl4xr/5IkaeMaW8CpqgsWWPyBce1PkiRplk8yliRJzTHgSJKk5hhwJElScww4kiSpOQYcSZLUHAOOJElqjgFHkiQ1x4AjSZKaY8CRJEnNMeBIkqTmGHAkSVJzDDiSJKk5BhxJktQcA44kSWqOAUeSJDXHgCNJkppjwJEkSc0x4EiSpOYYcCRJUnMMOJIkqTkGHEmS1BwDjiRJao4BR5IkNeeYAk6Sk5IcP65iJEmSVsOiASfJcUkuTPLJJIeBPwemk9yZ5K1JzuqnTEmSpNEt1YNzE3Am8G+Bp1TVGVV1KvBTwJ8ClyW5aMw1SpIkHZMTllj/gqr6m/kLq+ph
4Frg2iQnjqUySZKkZVq0B6eq/ibJhQBJXj6szTgKkyRJWq5RBhlvS/JS4PRxFyNJkrQalhpk/CbgFOBq4JQkv91LVZIkSSuw1CWqtwAPAxcBD1fV7/RSlSRJ0gqMconqgar6CHD/uIuRJElaDUtdonpiVX0YoKquGdZmHIVJkiQt11I9OAeSvC3JTyc5aXZhkh9JcnGSTwPnjbdESZKkY7Poc3Cq6vlJzgcuAc5NcjJwFPgGcAOwr6r+cvxlSpIkjW6pB/1RVTcwCDOSJEnrwkiTbSa5cZRlkiRJa8GiPThJngD8bWBzd3kq3apNwFPHXJskSdKyLHWJ6hLgDQzCzG38v4AzA7xnjHVJkiQt21KDjN8JvDPJ66rqXT3VJEmStCJLDjIGqKp3JXkusGPuZ6rqg2OqS5IkadlGCjhJPgScCRwCHusWF2DAkSRJa85IAQfYDeysqhp1w0muAH4eOFxVz+yWnQJ8lEFP0LeBl1bVXx1LwZIkSUsZ6TZx4A7gKce47St5/FOOLwVurKqzgBu795IkSatq1B6czcBUkj8DfjC7sKp+YdgHqurmJDvmLd4L7OleXwV8Hvg3I9YgSZI0klEDzptXaX+nVdU0QFVNJzl1WMMk+4H9ANu3b1+l3UuSpI1g1LuovjDuQhbY5+XA5QC7d+8eeeyPJEnSqFM1PJJkpvv6X0keSzKzjP09mGRrt82twOFlbEOSJGlRIwWcqnpSVW3qvp4A/BPg3cvY3/XAvu71PuDAMrYhSZK0qFHvovr/VNUfAz+zWJsk1wC3AD+W5L4kFwOXAT+b5JvAz3bvJUmSVtWoD/p78Zy3xzF4Ls6i42Kq6oIhq54/WmmSJEnLM+pdVC+a8/oog4f07V31aiRJklbBqHdRvWrchUiSJK2WUe+iOj3JdUkOJ3kwybVJTh93cZIkScsx6iDjP2BwB9RTgW3AJ7plkiRJa86oAWdLVf1BVR3tvq4EtoyxLkmSpGUbNeA8lOSiJMd3XxcB3x1nYZIkScs1asD5Z8BLgb8EpoFfBhx4LEmS1qRRbxP/98C+qvorgCSnAL/HIPhIkiStKaP24Pz92XADUFUPA88aT0mSJEkrM2rAOS7JybNvuh6cUXt/JEmSejVqSHkb8CdJPs5gioaXAr87tqokSZJWYNQnGX8wyUEGE2wGeHFVTY21MkmSpGUa+TJTF2gMNZIkac0bdQyOJEnSumHAkSRJzTHgSJKk5hhwJElScww4kiSpOQYcSZLUHAOOJElqjgFHkiQ1x4AjSZKaY8CRJEnNMeBIkqTmGHAkSVJzDDiSJKk5BhxJktQcA44kSWqOAUeSJDXHgCNJkppjwJEkSc0x4EiSpOYYcCRJUnMMOJIkqTkGHEmS1BwDjiRJas4Jky5AbZianuFl779lwXV7d23jwnO291yRJGkjM+Boxfbu2jZ03dT0DIABR5LUKwOOVuzCc7YPDTDDenUkSRonx+BIkqTmGHAkSVJzDDiSJKk5ExmDk+TbwCPAY8DRqto9iTokSVKbJjnI+HlV9dAE9y9JkhrlJSpJktScSQWcAj6T5LYk+xdqkGR/koNJDh45cqTn8iRJ0no2qYBzblX9JPBC4DVJfnp+g6q6vKp2V9XuLVu29F+hJElatyYyBqeqHui+H05yHXA2cPMkatH4LTaNAziVgyRp9fXeg5PkpCRPmn0N/BxwR991qB97d21j59ZNQ9dPTc9w4ND9PVYkSdoIJtGDcxpwXZLZ/V9dVZ+aQB3qwWLTOMBgKgcn6pQkrbbeA05V3QP8RN/71drkRJ2SpHFwsk1NlBN1SpLGwefgSJKk5hhwJElSc7xEpTXNW8wlScthwNGatdgAZHAQsiRpOAOO1qxRbjGXJGkhjsGRJEnNMeBIkqTmGHAkSVJzDDiSJKk5BhxJktQcA44kSWqOAUeSJDXHgCNJkppjwJEkSc0x4EiSpOYYcCRJUnMMOJIkqTkGHEmS1BwDjiRJao4BR5IkNceA
I0mSmmPAkSRJzTHgSJKk5pww6QKklZianuFl779lwXV7d23jwnO291yRJGktMOBo3dq7a9vQdVPTMwAGHEnaoAw4WrcuPGf70AAzrFdHkrQxOAZHkiQ1x4AjSZKaY8CRJEnNMeBIkqTmGHAkSVJzDDiSJKk5G/428bd84k6mHpgZut6HxUmStP7Yg7OIqekZDhy6f9JlSJKkY7The3De9KJnDF3nw+IkSVqf7MGRJEnN2fA9OJNw9a33rujS13LHBa1kv1PTM+zcumlZn9XatdTvhGPQJK1X9uBMwIFD9//fySCP1UrGBa1kvzu3blp0ckutT4v9TjgGTdJ6Zg/OhOzcuomPXvKcY/7cSscFLXe/atew3wnHoElaz+zBkSRJzTHgSJKk5kwk4CQ5L8k3ktyd5NJJ1CBJktrVe8BJcjzwHuCFwE7ggiQ7+65DkiS1axKDjM8G7q6qewCSfATYC0xNoJYlTU3PrPpgy5Xecr3cmjbard7jOHatWep3wp+hpJXa+dRNiz5Ud1wmEXC2AX8x5/19wDnzGyXZD+wH2L59Ms/hGNdt0Su55XolNW2kW703yn/nSi32O+HPUNJ6lqrqd4fJS4B/VFW/2r1/BXB2Vb1u2Gd2795dBw8e7KtESZK0TiS5rap2z18+iUHG9wFnzHl/OvDABOqQJEmNmkTA+TJwVpKnJfkh4OXA9ROoQ5IkNar3MThVdTTJa4FPA8cDV1TVnX3XIUmS2jWRqRqq6gbghknsW5Iktc8nGUuSpOYYcCRJUnMMOJIkqTkGHEmS1BwDjiRJak7vTzJejiRHgO8ssGoz8FDP5ejxPA5rh8di7fBYrB0ei7VjHMfi71bVlvkL10XAGSbJwYUez6x+eRzWDo/F2uGxWDs8FmtHn8fCS1SSJKk5BhxJktSc9R5wLp90AQI8DmuJx2Lt8FisHR6LtaO3Y7Gux+BIkiQtZL334EiSJD2OAUeSJDVn3QScJKck+WySb3bfTx7S7ttJbk9yKMnBvutsWZLzknwjyd1JLl1gfZL8h27915P85CTq3AhGOBZ7kny/Ow8OJfntSdTZuiRXJDmc5I4h6z0nejLCsfCc6EGSM5LclOSuJHcm+fUF2vRyXqybgANcCtxYVWcBN3bvh3leVe3yuQerJ8nxwHuAFwI7gQuS7JzX7IXAWd3XfuC9vRa5QYx4LAC+2J0Hu6rqd3otcuO4EjhvkfWeE/25ksWPBXhO9OEo8MaqejrwbOA1k/p/xXoKOHuBq7rXVwG/OMFaNqKzgbur6p6q+mvgIwyOyVx7gQ/WwJ8CT06yte9CN4BRjoV6UFU3Aw8v0sRzoicjHAv1oKqmq+or3etHgLuAbfOa9XJerKeAc1pVTcPgBwicOqRdAZ9JcluS/b1V175twF/MeX8fj/+lHaWNVm7Un/NzknwtyX9J8ox+StM8nhNri+dEj5LsAJ4F3DpvVS/nxQmrvcGVSPI54CkLrPqtY9jMuVX1QJJTgc8m+fMu2WtlssCy+c8YGKWNVm6Un/NXGMzP8miS84E/ZtAdrH55TqwdnhM9SvJE4FrgDVU1M3/1Ah9Z9fNiTfXgVNULquqZC3wdAB6c7cLqvh8eso0Huu+HgesYdOdr5e4Dzpjz/nTggWW00cot+XOuqpmqerR7fQNwYpLN/ZWojufEGuE50Z8kJzIINx+uqj9aoEkv58WaCjhLuB7Y173eBxyY3yDJSUmeNPsa+DlgwRH1OmZfBs5K8rQkPwS8nMExmet64JXdCPlnA9+fvayoVbXksUjylCTpXp/N4Fz/bu+VynNijfCc6Ef3M/4AcFdVvX1Is17OizV1iWoJlwEfS3IxcC/wEoAkTwX+U1WdD5wGXNf9Dp8AXF1Vn5pQvU2pqqNJXgt8GjgeuKKq7kzy6m79+4AbgPOBu4H/AbxqUvW2bMRj8cvAv0hyFPifwMvLx5avuiTXAHuAzUnuA94EnAieE30b4Vh4TvTjXOAVwO1JDnXLfhPY
Dv2eF07VIEmSmrOeLlFJkiSNxIAjSZKaY8CRJEnNMeBIkqTmGHAkSVLvlpogdV7b7d0knl/tJug8f6nPGHAkSdIkXMnSE6TO+nfAx6rqWQye/fX7S33AgCPpmCR5cpJfW2T9n6zCPn4lybu7169O8spF2u5J8tyV7lNSvxaaIDXJmUk+1c0n+cUkPz7bHNjUvf5hRnjy8Xp60J+kteHJwK8x719QSY6vqseqalXDRvdgsMXsAR4FVhysJE3c5cCrq+qbSc5h8HfmZ4A3M5hI+3XAScALltqQPTiSjtVlwJlJDiX5cndd/GrgdoAkj3bf9yS5Ocl1SaaSvC/J0L85SV6V5L8l+QKDp6HOLn9zkt/oXr++29bXk3ykm6341cC/7Or5qSQvSnJrd63+c0lOm7OdK5J8Psk9SV4/Zx+v7Lb5tSQf6pZtSXJt99/45STnImlsugk6nwv8YfcU5PcDW7vVFwBXVtXpDJ6C/KHF/p6APTiSjt2lwDOraleSPcAnu/ffWqDt2cBO4DvAp4AXAx+f36ibQPctwD8Avg/cBHx1yL6fVlU/SPLkqvpekvcBj1bV73XbOhl4dlVVkl8F/jXwxu7zPw48D3gS8I0k7wV+FPgt4NyqeijJKV3bdwLvqKovJdnOYGqMp4/+Y5J0jI4DvldVuxZYdzHdeJ2quiXJE4DNDJl4e3ZjkrQSfzYk3Myuu6eqHgOuAf7hkHbnAJ+vqiNV9dfAR4e0+zrw4SQXAUeHtDkd+HSS24F/BTxjzrpPVtUPquohBn8YT2PQ/f3xbhlVNTsm4AXAu7t/SV4PbEo3ma+k1VdVM8C3kszONZkkP9Gtvhd4frf86cATgCOLbc+AI2ml/vsi6+ZPdrfY5HejTIz3j4H3MOjpuS3JQr3Q7wLeXVV/D7iEwR/CWT+Y8/oxBr3YGbLv44DnVNWu7mtbVT0yQo2SRtBNkHoL8GNJ7usm0/6nwMVJvgbcCeztmr8R+Ofd8muAX1lqslQvUUk6Vo8wuMQzirOTPI3BJaqXMRhAuJBbgXcm+TvADPAS4GtzG3TX28+oqpuSfAm4EHhiV8+mOU1/GLi/e71vhBpvBK5L8o6q+m6SU7penM8ArwXe2u1/V1UdWmxDkkZXVRcMWfW4W8eraoo5Y/NGYQ+OpGNSVd8F/mv3cK63LtH8FgaDku8AvgVcN2Sb0wzukrgF+BzwlQWaHQ/85+7S01cZjI/5HvAJ4JdmBxl32/nDJF8EHhrhv+dO4HeBL3T/Onx7t+r1wO5u8PEUg8HMktaJLNHDI0nL0g1A/o2q+vlJ1yJp47EHR5IkNcceHEm9SnIr8LfmLX5FVd0+iXoktcmAI0mSmuMlKkmS1BwDjiRJao4BR5IkNceAI0mSmvN/ACcm0YXlljOXAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 576x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(8,4))\n",
    "df.plot1d('trip_distance', limits='minmax', f='log1p', progress='widget')\n",
    "plt.show()"
   ]
  },
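  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `f='log1p'` argument in the plot above appears to transform the binned counts with `log(1 + count)`, which compresses counts that span many orders of magnitude while keeping empty bins at zero. A minimal NumPy sketch with synthetic data follows; the distribution and bin settings are arbitrary choices for illustration.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "rng = np.random.default_rng(0)\n",
    "# Synthetic, heavily skewed values with a few extreme outliers.\n",
    "values = np.concatenate([rng.exponential(2.0, size=100_000), [500.0, 900.0]])\n",
    "\n",
    "counts, edges = np.histogram(values, bins=20, range=(0.0, 1000.0))\n",
    "log_counts = np.log1p(counts)  # log(1 + count); empty bins stay at 0\n",
    "\n",
    "print(counts)\n",
    "print(np.round(log_counts, 2))\n",
    "```"
   ]
  },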
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:09:00.074906Z",
     "start_time": "2020-06-10T17:08:59.165113Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array(7844544)"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# How many trips have a recorded distance of exactly 0?\n",
    "(df.trip_distance==0).astype('int').sum()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:09:07.232602Z",
     "start_time": "2020-06-10T17:09:06.772980Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "9a498cb4e2c04880aa461f845009bef1",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "The maximum trip distance in the data is 198623008.0 miles\n",
      "\n",
      "This is 831.4 times larger than the distance between the Earth and the Moon!\n",
      "or\n",
      "This is 5.9 times the distance to Mars!\n"
     ]
    }
   ],
   "source": [
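    "# What is the largest recorded trip distance?\n",
    "# Reference distances in miles: 238,900 (average Earth-Moon) and\n",
    "# 33,900,000 (approximate closest Earth-Mars approach)\n",
    "max_distance = df.trip_distance.max(progress='widget')\n",
    "print()\n",
    "print(f'The maximum trip distance in the data is {max_distance} miles')\n",
    "print()\n",
    "print('This is %3.1f times larger than the distance between the Earth and the Moon!' % (max_distance / 238_900))\n",
    "print('or')\n",
    "print('This is %1.1f times the distance to Mars!' % (max_distance / 33_900_000))"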
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:09:25.949148Z",
     "start_time": "2020-06-10T17:09:25.406582Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b4beb5a69f974df1a9615d8e042a71a1",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAjgAAAEYCAYAAABRMYxdAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAcC0lEQVR4nO3de7Ckd13n8feHSSIlMIQwI45nMpuRCurIwoiHDBcvw0VNIjhiKbmgXIQKqSWoW2tJttwlUWqrQETNcpuMbIzckqgBEnUwrCwa3VzMBIeQHCSMCSaTHMmEKINYBQ5894/ucZsz3X36nDlP9zlPv19VXaef5/d7ur9Pnu70Z37PLVWFJElSmzxq0gVIkiStNAOOJElqHQOOJElqHQOOJElqHQOOJElqHQOOJElqnTUZcJJckeShJHeO0HdLkk8k+dskdyQ5exw1SpKkyVmTAQe4EjhzxL7/DfiDqvo+4FzgXU0VJUmSVoc1GXCq6kbgkd55SZ6c5M+S3J7kr5J899HuwPru88cDD46xVEmSNAEnTLqAFbQHuLCqPpdkB52RmucDlwIfS/J64DHACydXoiRJGodWBJwkjwWeA/xhkqOzv6X79zzgyqp6W5JnA+9L8tSq+sYESpUkSWPQioBDZ1fbP1fV9j5tr6Z7vE5V3Zzk0cAG4KEx1idJksZoTR6Ds1BVHQbuTfIzAOl4erf5PuAF3fnfAzwaODSRQiVJ0lhkLd5NPMlVwE46IzFfAC4B/g/wbmATcCJwdVX9epJtwO8Cj6VzwPGvVNXHJlG3JEkajzUZcCRJkoZpxS4qSZKkXmvuIOMNGzbUaaedNukyJEnSKnD77bc/XFUbF85fcwHntNNOY9++fZMuQ5IkrQJJ/qHffHdRSZKk1jHgSJKk1jHgSJKk1jHgSJKk1jHgSJKk1jHgSJKk1jHgSJKk1jHgSJKk1llzF/pbiz54631ct/+Bge27ts9w/o4tY6xIkqR2cwRnDK7b/wBz84f7ts3NHx4afiRJ0tI1NoKT5ArgRcBDVfXUIf2eCdwCnFNVf9RUPZO2bdN6rnnts4+Zf87lN0+gGkmS2q3JEZwrgTOHdUiyDngLcEODdUiSpCnTWMCpqhuBRxbp9nrgWuChpuqQJEnTZ2IHGSeZAV4CPB945iJ9LwAuANiyZfUdjLvYQcRz84fZtmn9GCuSJGm6TfIg498B3lBVX1+sY1XtqarZqprduHHjGEpbmmEHEUPn+Jtd22fGWJEkSdNtkqeJzwJXJwHYAJyd5EhVfWSCNS3boIOIJUnS+E0s4FTV1qPPk1wJ/MlaDTeSJGl1afI08auAncCGJAeBS4ATAapqd1PvK0mS1FjAqarzltD3lU3VIUmSpo9XMpYkSa1jwJEkSa1jwJEkSa1jwJEkSa1jwJEkSa1jwJEkSa1jwJEkSa1jwJEkSa1jwJEkSa1jwJEkSa1jwJEkSa1jwJEkSa1jwJEkSa1jwJEkSa1zwqQLEMzNH+acy2/u27Zr+wzn79gy5ookSVrbDDgTtmv7zMC2ufnDAAYcSZKWyIAzYefv2DIwwAwa1ZEkScN5DI4kSWodA44kSWqdxgJOkiuSPJTkzgHtL0tyR/dxU5KnN1WLJEmaLk2O4FwJnDmk/V7gh6vqacCbgD0N1iJJkqZIYwcZV9WNSU4b0n5Tz+QtwOamapEkSdNltRyD82rgo4Mak1yQZF+SfYcOHRpjWZIkaS2aeMBJ8jw6AecNg/pU1Z6qmq2q2Y0bN46vOEmStCZN9Do4SZ4GvAc4q6q+OMlaJElSe0xsBCfJFuBDwM9V1d2TqkOSJLVPYyM4Sa4CdgIbkhwELgFOBKiq3cAbgScC70oCcKSqZpuqR5IkTY8mz6I6b5H21wCvaer9JUnS9Jr4QcaSJEkrzYAjSZJax4AjSZJax4AjSZJax4AjSZJax4AjSZJax4AjSZJax4AjSZJa
x4AjSZJax4AjSZJax4AjSZJax4AjSZJax4AjSZJax4AjSZJax4AjSZJax4AjSZJax4AjSZJax4AjSZJa54RJF6Dh5uYPc87lNw9s37V9hvN3bBljRZIkrX4GnFVs1/aZoe1z84cBDDiSJC3QWMBJcgXwIuChqnpqn/YAlwFnA/8KvLKqPtlUPWvR+Tu2DA0vw0Z2JEmaZk0eg3MlcOaQ9rOA07uPC4B3N1iLJEmaIo0FnKq6EXhkSJddwHur4xbg5CSbmqpHkiRNj0meRTUD3N8zfbA77xhJLkiyL8m+Q4cOjaU4SZK0dk0y4KTPvOrXsar2VNVsVc1u3Lix4bIkSdJaN8mAcxA4tWd6M/DghGqRJEktMsmAcz3w8nQ8C/hSVc1PsB5JktQSTZ4mfhWwE9iQ5CBwCXAiQFXtBvbSOUX8AJ3TxF/VVC2SJGm6NBZwquq8RdoLeF1T7y9JkqaX96KSJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmtY8CRJEmts6SAk+QxSdY1VYwkSdJKOGFYY5JHAecCLwOeCXwV+JYkh4C9wJ6q+lzjVWqgufnDnHP5zX3bdm2f4fwdW8ZckSRJkzc04ACfAP4c+K/AnVX1DYAkpwDPA96c5MNV9f5my1Q/u7bPDGybmz8MYMCRJE2lxQLOC6vq3xbOrKpHgGuBa5OcOGjhJGcClwHrgPdU1ZsXtD8eeD+wpVvLb1bV7y1tFcbjg7fex3X7H+jbNjd/mG2b1o+5ok54GRRgBo3qSJI0DYYeg1NV/5bkfIAk5w7q029+91iddwJnAduA85JsW9DtdcBcVT0d2Am8LclJS1qDMblu/wP/Piqy0LZN64eOpkiSpPFabAQHYCbJS4HNS3ztM4ADVXUPQJKrgV3AXE+fAh6XJMBjgUeAI0t8n7HZtmk917z22ZMuQ5IkLWLoCE6SS4BTgA8CpyR54xJeewa4v2f6YHder3cA3wM8CHwa+MWjx/ksqOOCJPuS7Dt06NASSpAkSdNosV1Uv0ZnVOVngUeq6teX8Nrp95ILpn8M2A98B7AdeEeSYw5mqao9VTVbVbMbN25cQgmSJGkajXIdnAer6mqg/xG2gx0ETu2Z3kxnpKbXq4APVccB4F7gu5f4PpIkSd9ksV1Uj62qDwBU1VWD+gxY/Dbg9CRbuwcOnwtcv6DPfcALuq/zJOC7gHtGL1+SJOlYi43gXJfkbUl+KMljjs5M8p1JXp3kBuDMfgtW1RHgIuAG4DPAH1TVXUkuTHJht9ubgOck+TTwceANVfXw8a6UJEmabkPPoqqqFyQ5G3gt8NwkT6BzltNn6VzJ+BVV9Y9Dlt/b7dc7b3fP8weBH11++ZIkScda9DTxfiFFkiRpNRvpZptJPj7KPEmSpNVgsZttPhr4VmBDd/fU0VO/19M5tVuSJGnVWWwX1WuBX6ITZm7n/wecw3RuwyBJkrTqLHaQ8WXAZUleX1VvH1NNkiRJx2WUe1FRVW9P8hzgtN5lquq9DdUlSZK0bCMFnCTvA55M57YKX+/OLsCAI0mSVp2RAg4wC2yrqoX3kpIkSVp1RjpNHLgT+PYmC5EkSVopo47gbADmkvwN8NWjM6vqJxqpSpIk6TiMGnAubbIISZKklTTqWVR/2XQhkiRJK2XUs6i+TOesKYCTgBOBr1TV+qYKkyRJWq5RR3Ae1zud5CeBMxqpSCtmbv4w51x+88D2XdtnOH/HljFWJEnSeIx6DM43qaqPJLl4pYvRytm1fWZo+9z8YQADjiSplUbdRfVTPZOPonNdHK+Js4qdv2PL0PAybGRHkqS1btQRnBf3PD8CfB7YteLVSJIkrYBRj8F5VdOFSJIkrZSRrmScZHOSDyd5KMkXklybZHPTxUmSJC3HqLuofg/4IPAz3emf7c77kSaK0ngM
O8vKM6wkSWvZqPei2lhVv1dVR7qPK4GNiy2U5Mwkn01yYNBZV0l2Jtmf5K4kXlBwTHZtn2Hbpv6XMZqbP8x1+x8Yc0WSJK2cUUdwHk7ys8BV3enzgC8OWyDJOuCddEZ5DgK3Jbm+quZ6+pwMvAs4s6ruS/JtS10BLc+ws6w8w0qStNaNOoLz88BLgX8E5oGfBhY78PgM4EBV3VNVXwOu5tgzr84HPlRV9wFU1UOjFi5JkjTIqAHnTcArqmpjVX0bncBz6SLLzAD390wf7M7r9RTgCUn+IsntSV7e74WSXJBkX5J9hw4dGrFkSZI0rUYNOE+rqn86OlFVjwDft8gy6TNv4cUBTwC+H/hx4MeA/57kKccsVLWnqmaranbjxkUP/ZEkSVNu1GNwHpXkCUdDTpJTRlj2IHBqz/Rm4ME+fR6uqq8AX0lyI/B04O4R65IkSTrGqCM4bwNuSvKmJL8O3AT8xiLL3AacnmRrkpOAc4HrF/S5DvjBJCck+VZgB/CZ0cuXJEk61qhXMn5vkn3A8+nsevqp3rOhBixzJMlFwA3AOuCKqroryYXd9t1V9ZkkfwbcAXwDeE9V3Xkc6yNJkjT63cS7gWZoqOmzzF5g74J5uxdMvxV461JeV5IkaZhRd1FJkiStGQYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOgYcSZLUOidMugCtTnPzhznn8pv7tu3aPsP5O7aMuSJJkkbX6AhOkjOTfDbJgSQXD+n3zCRfT/LTTdaj0ezaPsO2Tev7ts3NH+a6/Q+MuSJJkpamsRGcJOuAdwI/AhwEbktyfVXN9en3FuCGpmrR0py/Y8vAEZpBozqSJK0mTY7gnAEcqKp7quprwNXArj79Xg9cCzzUYC2SJGmKNBlwZoD7e6YPduf9uyQzwEuA3Q3WIUmSpkyTASd95tWC6d8B3lBVXx/6QskFSfYl2Xfo0KEVK1CSJLVTk2dRHQRO7ZneDDy4oM8scHUSgA3A2UmOVNVHejtV1R5gD8Ds7OzCkCRJkvRNmgw4twGnJ9kKPACcC5zf26Gqth59nuRK4E8WhhtJkqSlaizgVNWRJBfROTtqHXBFVd2V5MJuu8fdSJKkRjR6ob+q2gvsXTCvb7Cpqlc2WYskSZoe3qpBkiS1jrdq0JINu40DeCsHSdLkGXC0JLu2zwxtn5s/DGDAkSRNlAFHSzLsNg7grRwkSauDx+BIkqTWMeBIkqTWMeBIkqTWMeBIkqTWMeBIkqTWMeBIkqTW8TRxrbhhFwL0IoCSpHEw4GhFDbsQoBcBlCSNiwGnx6/98V3MPXi4b9vc/GG2bVo/5orWnmEXAvQigJKkcfEYnBFt27R+0dsUSJKk1cERnB6XvPh7J12CJElaAY7gSJKk1nEER2PlGVaSpHEw4GhsPMNKkjQuBhyNjWdYSZLGxWNwJElS6zQacJKcmeSzSQ4kubhP+8uS3NF93JTk6U3WI0mSpkNjASfJOuCdwFnANuC8JNsWdLsX+OGqehrwJmBPU/VIkqTp0eQIzhnAgaq6p6q+BlwN7OrtUFU3VdU/dSdvATY3WI8kSZoSTR5kPAPc3zN9ENgxpP+rgY/2a0hyAXABwJYtnmXTVsNOIQdPI5ckja7JgJM+86pvx+R5dALOD/Rrr6o9dHdfzc7O9n0NrW2L3QbD08glSUvRZMA5CJzaM70ZeHBhpyRPA94DnFVVX2ywHq1iw04hB08jlyQtTZPH4NwGnJ5ka5KTgHOB63s7JNkCfAj4uaq6u8FaJEnSFGlsBKeqjiS5CLgBWAdcUVV3Jbmw274beCPwROBdSQCOVNVsUzVpbfM2D5KkUTV6JeOq2gvsXTBvd8/z1wCvabIGtYO3eZAkLYW3atCasNzbPHzw1vu4bv8DQ1/b0R9Jah8D
jlph0O6rW+99BIAdW08ZuBw4+iNJbWPA0Zo3bPfVjq2nDB2h8ewsSWonA47WvMVOMZckTR/vJi5JklrHgCNJklrHXVSael5fR5Lax4Cjqeb1dSSpnVK1tu5dOTs7W/v27Zt0GZoC51x+M3Pzh9m2aX3fdkd3JGnyktze7y4IjuBIAzi6I0lrlyM40jIsNroDjvBI0jg4giOtoGGjO9C5gvKt9z4y8DYRhh9JapYBR1qGxS4uOOweWO7ekqTmuYtKGjMPXpakleMuKmmVGLZ7a7FdW0eX7xeAFrtzusFJ0jQx4EhjNmz31mIhZVgAGnbn9CZ3iy1W8zDLDV2GOUmLcReVtIYs94d9lLO+lmtYsGpiucWWPbqe17z22Ut+XUlrz6BdVAYcaQoczyjLKJYzYnK8NTUV5hz9kdYWA46kqXA8wel4RpXAcKTRjPIZ9bM0OgOOJC1ikuFoufwhnJzlfl4W+6wcz2dpGj8PEwk4Sc4ELgPWAe+pqjcvaE+3/WzgX4FXVtUnh72mAUfSatT0bsB+mgxVbfqhbGrbNBVEmgpOx2M1fx7GHnCSrAPuBn4EOAjcBpxXVXM9fc4GXk8n4OwALquqHcNe14AjSR2r8Yd7NZqWH/7V+nnY9h3rueTF37uSJX2TSVwH5wzgQFXd0y3gamAXMNfTZxfw3uqkrFuSnJxkU1XNN1iXJLXCYlfUXq5JjEY1acfWU1ZVEGmKn4dv1mTAmQHu75k+SGeUZrE+M8A3BZwkFwAXAGzZ0u4PqCRNWlM/lFqb1urn4VENvnb6zFu4P2yUPlTVnqqararZjRs3rkhxkiSpvZoMOAeBU3umNwMPLqOPJEnSkjQZcG4DTk+yNclJwLnA9Qv6XA+8PB3PAr7k8TeSJOl4NXYMTlUdSXIRcAOd08SvqKq7klzYbd8N7KVzBtUBOqeJv6qpeiRJ0vRo9GabVbWXTojpnbe753kBr2uyBkmSNH2a3EUlSZI0EQYcSZLUOgYcSZLUOmvuZptJDgH/0OBbbAAebvD1V5NpWddpWU9wXdtqWtZ1WtYTXNeV9B+q6piL5K25gNO0JPv63dOijaZlXadlPcF1batpWddpWU9wXcfBXVSSJKl1DDiSJKl1DDjH2jPpAsZoWtZ1WtYTXNe2mpZ1nZb1BNe1cR6DI0mSWscRHEmS1DoGHEmS1DpTGXCSnJnks0kOJLm4T3uS/M9u+x1JnjGJOo9XklOTfCLJZ5LcleQX+/TZmeRLSfZ3H2+cRK0rIcnnk3y6ux77+rS3Zbt+V8/22p/kcJJfWtBnzW7XJFckeSjJnT3zTknyv5N8rvv3CQOWHfrdXm0GrOtbk/xd9zP64SQnD1h26Od9NRmwnpcmeaDnM3r2gGXbsE2v6VnPzyfZP2DZNbNNYfBvzKr5vlbVVD3o3Nn874HvBE4CPgVsW9DnbOCjQIBnAbdOuu5lrusm4Bnd548D7u6zrjuBP5l0rSu0vp8HNgxpb8V2XbBO64B/pHOhq1ZsV+CHgGcAd/bM+w3g4u7zi4G3DPhvMfS7vdoeA9b1R4ETus/f0m9du21DP++r6TFgPS8FfnmR5VqxTRe0vw1441rfpt16+/7GrJbv6zSO4JwBHKiqe6rqa8DVwK4FfXYB762OW4CTk2wad6HHq6rmq+qT3edfBj4DzEy2qolqxXZd4AXA31dVk1f3HququhF4ZMHsXcDvd5//PvCTfRYd5bu9qvRb16r6WFUd6U7eAmwee2ErbMA2HUUrtulRSQK8FLhqrEU1ZMhvzKr4vk5jwJkB7u+ZPsixP/qj9FlTkpwGfB9wa5/mZyf5VJKPJvnesRa2sgr4WJLbk1zQp7112xU4l8H/s2zLdgV4UlXNQ+d/qsC39enTxu3783RGHftZ7PO+FlzU3RV3xYDdGG3bpj8IfKGqPjegfc1u0wW/Mavi+zqNASd95i08V36UPmtGkscC1wK/VFWHFzR/ks7ujacDbwc+Mu76VtBzq+oZwFnA
65L80IL2tm3Xk4CfAP6wT3Obtuuo2rZ9fxU4AnxgQJfFPu+r3buBJwPbgXk6u24WatU2Bc5j+OjNmtymi/zGDFysz7wV3bbTGHAOAqf2TG8GHlxGnzUhyYl0PngfqKoPLWyvqsNV9S/d53uBE5NsGHOZK6KqHuz+fQj4MJ0h0F6t2a5dZwGfrKovLGxo03bt+sLR3Yndvw/16dOa7ZvkFcCLgJdV94CFhUb4vK9qVfWFqvp6VX0D+F3619+mbXoC8FPANYP6rMVtOuA3ZlV8X6cx4NwGnJ5ka/dfwOcC1y/ocz3w8u5ZN88CvnR0uG0t6e7v/V/AZ6rqtwb0+fZuP5KcQecz8cXxVbkykjwmyeOOPqdzoOadC7q1Yrv2GPivwbZs1x7XA6/oPn8FcF2fPqN8t1e9JGcCbwB+oqr+dUCfUT7vq9qC499eQv/6W7FNu14I/F1VHezXuBa36ZDfmNXxfZ30UdiTeNA5m+ZuOkdw/2p33oXAhd3nAd7Zbf80MDvpmpe5nj9AZ8jvDmB/93H2gnW9CLiLzhHstwDPmXTdy1zX7+yuw6e669Pa7dpdl2+lE1ge3zOvFduVTmibB/6Nzr/yXg08Efg48Lnu31O6fb8D2Nuz7DHf7dX8GLCuB+gcm3D0O7t74boO+ryv1seA9Xxf93t4B50ftk1t3abd+Vce/X729F2z27Rb86DfmFXxffVWDZIkqXWmcReVJElqOQOOJElqHQOOJElqHQOOJElqHQOOJElqHQOOJElqHQOOpGVJcnKS/zSk/aYVeI9XJnlH9/mFSV4+pO/OJM853veU1A4GHEnLdTJwTMBJsg6gqlY0bFTV7qp675AuOwEDjiTAgCNp+d4MPDnJ/iS3JflEkg/SuTotSf6l+3dnkhuTfDjJXJLdSQb+vyfJq5LcneQvgef2zL80yS93n/9C97XuSHJ1907GFwL/uVvPDyZ5cZJbk/xtkj9P8qSe17kiyV8kuSfJL/S8x8u7r/mpJO/rztuY5NruOt6W5LlIWvVOmHQBktasi4GnVtX2JDuBP+1O39un7xnANuAfgD+jc9PBP1rYqXt/ol8Dvh/4EvAJ4G8HvPfWqvpqkpOr6p+T7Ab+pap+s/taTwCeVVWV5DXArwD/pbv8dwPPAx4HfDbJu4GnAL9K547ODyc5pdv3MuC3q+qvk2wBbgC+Z/T/TJImwYAjaaX8zYBwc7TtHoAkV9G5h80xAQfYAfxFVR3q9r2GTvBY6A7gA0k+AnxkwHtuBq7phqaTgN7a/rSqvgp8NclDwJOA5wN/VFUPA1TVI92+LwS2de9dCrA+yeOq6ssD3lfSKuAuKkkr5StD2hbe9G7YTfBGuUHej9O5cer3A7cn6fePtbcD76iq/wi8Fnh0T9tXe55/nc4/9jLgvR8FPLuqtncfM4YbafUz4Ehari/T2cUzijOSbO0ee3MO8NcD+t0K7EzyxCQnAj+zsEP3NU6tqk/Q2e10MvDYPvU8Hnig+/wVI9T4ceClSZ7YfZ+ju6g+Rufu7Efff/sIryVpwgw4kpalqr4I/N8kdwJvXaT7zXQOSr6Tzq6iDw94zXng0m7/Pwc+2afbOuD9ST5N5/ic366qfwb+GHjJ0YOMu6/zh0n+Cnh4hPW5C/gfwF8m+RTwW92mXwBmuwcfz9E5mFnSKpeqUUaDJWl5ugcg/3JVvWjStUiaHo7gSJKk1nEER9JEJLkV+JYFs3+uqj49iXoktYsBR5IktY67qCRJUusYcCRJUusYcCRJUusYcCRJUuv8Pyk5IzuS9xQBAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 576x288 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(8,4))\n",
    "df.plot1d('trip_distance', limits=[0, 20], f=None, progress='widget')\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:09:40.792027Z",
     "start_time": "2020-06-10T17:09:40.569328Z"
    }
   },
   "outputs": [],
   "source": [
    "# Remove zero, negative, and overly long (>= 10) trip distances\n",
    "df = df[(df.trip_distance>0) & (df.trip_distance<10)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### What _is_ New York City really?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:09:58.347172Z",
     "start_time": "2020-06-10T17:09:55.523163Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "b2b15c882e574652840e2a43542c248d",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Heatmap(children=[ToolsToolbar(interact_value=None, supports_normalize=False, template='<template>\\n  <v-toolb…"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Interactively plot the pickup locations\n",
    "df.plot_widget(df.pickup_longitude, \n",
    "               df.pickup_latitude, \n",
    "               shape=512, \n",
    "               f='log1p', \n",
    "               colormap='plasma', \n",
    "               limits='minmax')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:11:38.888431Z",
     "start_time": "2020-06-10T17:11:38.681731Z"
    }
   },
   "outputs": [],
   "source": [
    "# Define a bounding box around NYC (longitude and latitude in degrees)\n",
    "long_min = -74.05\n",
    "long_max = -73.75\n",
    "lat_min = 40.58\n",
    "lat_max = 40.90\n",
    "\n",
    "# Keep only trips that both start and end inside the bounding box\n",
    "df = df[(df.pickup_longitude > long_min)  & (df.pickup_longitude < long_max) &\n",
    "        (df.pickup_latitude > lat_min)    & (df.pickup_latitude < lat_max) &\n",
    "        (df.dropoff_longitude > long_min) & (df.dropoff_longitude < long_max) &\n",
    "        (df.dropoff_latitude > lat_min)   & (df.dropoff_latitude < lat_max)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Create some date/time features"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:12:11.624745Z",
     "start_time": "2020-06-10T17:11:56.039813Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead>\n",
       "<tr><th>#                                        </th><th>vendor_id  </th><th>pickup_datetime              </th><th>dropoff_datetime             </th><th>passenger_count  </th><th>payment_type  </th><th>trip_distance     </th><th>pickup_longitude  </th><th>pickup_latitude   </th><th>rate_code  </th><th>store_and_fwd_flag  </th><th>dropoff_longitude  </th><th>dropoff_latitude  </th><th>fare_amount       </th><th>surcharge  </th><th>mta_tax  </th><th>tip_amount        </th><th>tolls_amount  </th><th>total_amount      </th><th>tip_percentage     </th><th>pickup_hour  </th><th>pickup_day_of_week  </th><th>pickup_is_weekend  </th></tr>\n",
       "</thead>\n",
       "<tbody>\n",
       "<tr><td><i style='opacity: 0.6'>0</i>            </td><td>VTS        </td><td>2009-01-04 02:52:00.000000000</td><td>2009-01-04 03:02:00.000000000</td><td>1                </td><td>CASH          </td><td>2.630000114440918 </td><td>-73.99195861816406</td><td>40.72156524658203 </td><td>nan        </td><td>nan                 </td><td>-73.99380493164062 </td><td>40.6959228515625  </td><td>8.899999618530273 </td><td>0.5        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>9.399999618530273 </td><td>0.0                </td><td>2            </td><td>6                   </td><td>1                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1</i>            </td><td>VTS        </td><td>2009-01-04 03:31:00.000000000</td><td>2009-01-04 03:38:00.000000000</td><td>3                </td><td>Credit        </td><td>4.550000190734863 </td><td>-73.98210144042969</td><td>40.736289978027344</td><td>nan        </td><td>nan                 </td><td>-73.95584869384766 </td><td>40.768028259277344</td><td>12.100000381469727</td><td>0.5        </td><td>nan      </td><td>2.0               </td><td>0.0           </td><td>14.600000381469727</td><td>0.13698630034923553</td><td>3            </td><td>6                   </td><td>1                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>2</i>            </td><td>DDS        </td><td>2009-01-01 20:52:58.000000000</td><td>2009-01-01 21:14:00.000000000</td><td>1                </td><td>CREDIT        </td><td>5.0               </td><td>-73.9742660522461 </td><td>40.79095458984375 </td><td>nan        </td><td>nan                 </td><td>-73.9965591430664  </td><td>40.731849670410156</td><td>14.899999618530273</td><td>0.5        </td><td>nan      </td><td>3.049999952316284 </td><td>0.0           </td><td>18.450000762939453</td><td>0.16531164944171906</td><td>20           </td><td>3                   </td><td>0                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>3</i>            </td><td>DDS        </td><td>2009-01-24 16:18:23.000000000</td><td>2009-01-24 16:24:56.000000000</td><td>1                </td><td>CASH          </td><td>0.4000000059604645</td><td>-74.00157928466797</td><td>40.719383239746094</td><td>nan        </td><td>nan                 </td><td>-74.00837707519531 </td><td>40.7203483581543  </td><td>3.700000047683716 </td><td>0.0        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>3.700000047683716 </td><td>0.0                </td><td>16           </td><td>5                   </td><td>1                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>4</i>            </td><td>DDS        </td><td>2009-01-16 22:35:59.000000000</td><td>2009-01-16 22:43:35.000000000</td><td>2                </td><td>CASH          </td><td>1.2000000476837158</td><td>-73.98980712890625</td><td>40.73500442504883 </td><td>nan        </td><td>nan                 </td><td>-73.98502349853516 </td><td>40.72449493408203 </td><td>6.099999904632568 </td><td>0.5        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>6.599999904632568 </td><td>0.0                </td><td>22           </td><td>4                   </td><td>0                  </td></tr>\n",
       "<tr><td>...                                      </td><td>...        </td><td>...                          </td><td>...                          </td><td>...              </td><td>...           </td><td>...               </td><td>...               </td><td>...               </td><td>...        </td><td>...                 </td><td>...                </td><td>...               </td><td>...               </td><td>...        </td><td>...      </td><td>...               </td><td>...           </td><td>...               </td><td>...                </td><td>...          </td><td>...                 </td><td>...                </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,545</i></td><td>VTS        </td><td>2015-12-31 23:59:56.000000000</td><td>2016-01-01 00:08:18.000000000</td><td>5                </td><td>1             </td><td>1.2000000476837158</td><td>-73.99381256103516</td><td>40.72087097167969 </td><td>1.0        </td><td>0.0                 </td><td>-73.98621368408203 </td><td>40.722469329833984</td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>1.7599999904632568</td><td>0.0           </td><td>10.5600004196167  </td><td>0.1666666567325592 </td><td>23           </td><td>3                   </td><td>0                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,546</i></td><td>CMT        </td><td>2015-12-31 23:59:58.000000000</td><td>2016-01-01 00:05:19.000000000</td><td>2                </td><td>2             </td><td>2.0               </td><td>-73.96527099609375</td><td>40.76028060913086 </td><td>1.0        </td><td>0.0                 </td><td>-73.93951416015625 </td><td>40.75238800048828 </td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>8.800000190734863 </td><td>0.0                </td><td>23           </td><td>3                   </td><td>0                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,547</i></td><td>CMT        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:12:55.000000000</td><td>2                </td><td>2             </td><td>3.799999952316284 </td><td>-73.98729705810547</td><td>40.739078521728516</td><td>1.0        </td><td>0.0                 </td><td>-73.9886703491211  </td><td>40.69329833984375 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>14.800000190734863</td><td>0.0                </td><td>23           </td><td>3                   </td><td>0                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,548</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:10:26.000000000</td><td>1                </td><td>2             </td><td>1.9600000381469727</td><td>-73.99755859375   </td><td>40.72569274902344 </td><td>1.0        </td><td>0.0                 </td><td>-74.01712036132812 </td><td>40.705322265625   </td><td>8.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>9.800000190734863 </td><td>0.0                </td><td>23           </td><td>3                   </td><td>0                  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,549</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:21:30.000000000</td><td>1                </td><td>1             </td><td>1.059999942779541 </td><td>-73.9843978881836 </td><td>40.76725769042969 </td><td>1.0        </td><td>0.0                 </td><td>-73.99098205566406 </td><td>40.76057052612305 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>2.9600000381469727</td><td>0.0           </td><td>17.760000228881836</td><td>0.1666666716337204 </td><td>23           </td><td>3                   </td><td>0                  </td></tr>\n",
       "</tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "#              vendor_id    pickup_datetime                dropoff_datetime               passenger_count    payment_type    trip_distance       pickup_longitude    pickup_latitude     rate_code    store_and_fwd_flag    dropoff_longitude    dropoff_latitude    fare_amount         surcharge    mta_tax    tip_amount          tolls_amount    total_amount        tip_percentage       pickup_hour    pickup_day_of_week    pickup_is_weekend\n",
       "0              VTS          2009-01-04 02:52:00.000000000  2009-01-04 03:02:00.000000000  1                  CASH            2.630000114440918   -73.99195861816406  40.72156524658203   nan          nan                   -73.99380493164062   40.6959228515625    8.899999618530273   0.5          nan        0.0                 0.0             9.399999618530273   0.0                  2              6                     1\n",
       "1              VTS          2009-01-04 03:31:00.000000000  2009-01-04 03:38:00.000000000  3                  Credit          4.550000190734863   -73.98210144042969  40.736289978027344  nan          nan                   -73.95584869384766   40.768028259277344  12.100000381469727  0.5          nan        2.0                 0.0             14.600000381469727  0.13698630034923553  3              6                     1\n",
       "2              DDS          2009-01-01 20:52:58.000000000  2009-01-01 21:14:00.000000000  1                  CREDIT          5.0                 -73.9742660522461   40.79095458984375   nan          nan                   -73.9965591430664    40.731849670410156  14.899999618530273  0.5          nan        3.049999952316284   0.0             18.450000762939453  0.16531164944171906  20             3                     0\n",
       "3              DDS          2009-01-24 16:18:23.000000000  2009-01-24 16:24:56.000000000  1                  CASH            0.4000000059604645  -74.00157928466797  40.719383239746094  nan          nan                   -74.00837707519531   40.7203483581543    3.700000047683716   0.0          nan        0.0                 0.0             3.700000047683716   0.0                  16             5                     1\n",
       "4              DDS          2009-01-16 22:35:59.000000000  2009-01-16 22:43:35.000000000  2                  CASH            1.2000000476837158  -73.98980712890625  40.73500442504883   nan          nan                   -73.98502349853516   40.72449493408203   6.099999904632568   0.5          nan        0.0                 0.0             6.599999904632568   0.0                  22             4                     0\n",
       "...            ...          ...                            ...                            ...                ...             ...                 ...                 ...                 ...          ...                   ...                  ...                 ...                 ...          ...        ...                 ...             ...                 ...                  ...            ...                   ...\n",
       "1,083,167,545  VTS          2015-12-31 23:59:56.000000000  2016-01-01 00:08:18.000000000  5                  1               1.2000000476837158  -73.99381256103516  40.72087097167969   1.0          0.0                   -73.98621368408203   40.722469329833984  7.5                 0.5          0.5        1.7599999904632568  0.0             10.5600004196167    0.1666666567325592   23             3                     0\n",
       "1,083,167,546  CMT          2015-12-31 23:59:58.000000000  2016-01-01 00:05:19.000000000  2                  2               2.0                 -73.96527099609375  40.76028060913086   1.0          0.0                   -73.93951416015625   40.75238800048828   7.5                 0.5          0.5        0.0                 0.0             8.800000190734863   0.0                  23             3                     0\n",
       "1,083,167,547  CMT          2015-12-31 23:59:59.000000000  2016-01-01 00:12:55.000000000  2                  2               3.799999952316284   -73.98729705810547  40.739078521728516  1.0          0.0                   -73.9886703491211    40.69329833984375   13.5                0.5          0.5        0.0                 0.0             14.800000190734863  0.0                  23             3                     0\n",
       "1,083,167,548  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:10:26.000000000  1                  2               1.9600000381469727  -73.99755859375     40.72569274902344   1.0          0.0                   -74.01712036132812   40.705322265625     8.5                 0.5          0.5        0.0                 0.0             9.800000190734863   0.0                  23             3                     0\n",
       "1,083,167,549  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:21:30.000000000  1                  1               1.059999942779541   -73.9843978881836   40.76725769042969   1.0          0.0                   -73.99098205566406   40.76057052612305   13.5                0.5          0.5        2.9600000381469727  0.0             17.760000228881836  0.1666666716337204   23             3                     0"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
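    "# Hour-of-day and day-of-week features derived from the pickup timestamp\n",
    "df['pickup_hour'] = df.pickup_datetime.dt.hour\n",
    "df['pickup_day_of_week'] = df.pickup_datetime.dt.dayofweek\n",
    "# dayofweek counts from Monday=0, so 5 and 6 are Saturday and Sunday\n",
    "df['pickup_is_weekend'] = (df.pickup_day_of_week>=5).astype('int')\n",
    "\n",
    "# Treat the hour and the day of week as categorical features\n",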
    "df.categorize(column='pickup_hour', inplace=True)\n",
    "\n",
    "weekday_names_list = ['Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat', 'Sun']\n",
    "df.categorize(column='pickup_day_of_week', labels=weekday_names_list, inplace=True)\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:12:33.838161Z",
     "start_time": "2020-06-10T17:12:28.936605Z"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAA8sAAAFgCAYAAACMteurAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nO3deZxkdX3v/9e7exiGfXEQZRM0BDSoqKO4gyYkmLjENeASlySjj2hu8ktM1JtcNfFeozcac40LPzRIXK6o4IKIiBuIKwyoCCiKqEBAYQRU1ln6c/+o01Jd09Pd1XTVqZp+PR+Pekydpeq8q6b7dH3qu5xUFZIkSZIk6U4TbQeQJEmSJGnUWCxLkiRJktTDYlmSJEmSpB4Wy5IkSZIk9bBYliRJkiSph8WyJEmSJEk9LJYlSZIkSXNKcmKS65JcvIB935LkW83t+0luGkbGpRavsyxJkiRJmkuSxwI3A++tqsP6eNxfAA+qqhcNLNyA2LIsSZIkSZpTVX0JuKF7XZL7JDkzyQVJzk1y6CwPPQ744FBCLrEVbQeQJEmSJI2lE4CXVNUPkhwBvAN4/PTGJPcCDgK+0FK+u8RiWZIkSZLUlyQ7A48EPpJkevX2PbsdC5xSVZuHmW2pWCxLkiRJkvo1AdxUVYfPsc+xwEuHlGfJOWZZkiRJktSXqvol8KMkzwRIxwOntyc5BNgD+FpLEe8yi2VJkiRJ0pySfJBO4XtIkquT/AnwHOBPknwbuAR4StdDjgNOrjG+/JKXjpIkSZIkqYcty5IkSZIk9RjrCb72WLmy9tlhVdsxWLHTaLyNK/bYqe0IHat2azsBAL/Y0DsZXzuuXX9z2xEAqKnR6EUysWKy7QgA7LzjyrYjALDnTtu1HYEdJm9vOwIA2XhL2xEAqI0b244wWu6c4bRVmRyNv7VMtv87C1CbN7QdAYDaOBo5Mjkaf1uyYjQ+ezAxGu8HGY3f2420Xy8AXPTNC9dX1V5t51gKh6/ao341tbi/l1dsvOUzVXXMEkcaiNH4CV6kfXZYxcmPOKLtGNztiLu1HQGA1U9r/70AyKF/0HYEAD555QFtRwDg9e8ZjTkNNt96R9sRANjx7qPxZcrDDx+Nn4/jHnqPtiNw+C7fbzsCAPnZurYjALDxp1e3HWGkZLvRKA5X7L667QgATOx2z7YjALD5xqvajgDAhmtG4/dlxW6j8bdlxd4Hth0BgOywZ9sRAJhaORq/t9dxn7YjALDPjjv8pO0MS+VXUxt5w95zTYK9dc+6+iuj8YOxAGNdLEuSJEmShivA5Gh0PBooi2VJkiRJ0sIlTExs+9WyxbIkSZIkqS8Ty2Cq6GXwEiVJkiRJ6o8ty5IkSZKkBXPMsiRJkiRJvYJjliVJkiRJ6haWx5hli2VJkiRJ0sLFYlmSJEmSpBkCTMRu2JIkSZIkzWDLsiRJkiRJ3ZbJBF/L4PsASZIkSZL6Y8uyJEmSJGnBAmQZNLtaLEuSJEmS+uKYZUmSJEmSui2TMcsWy5IkSZKkBbMbtiRJkiRJvZJl0bI8lO8Dkuya5JNJzk5yXpInDeO4kiRJkqSll4nF3eZ93uTEJNcluXgr25PkrUkuT3JRkgcv9WubNqzG8+cBZ1bVUcARwLlDOq4kSZIkaXycBBwzx/YnAAc3t7XAOwcVZFjF8q3Aw5LsXR03JVk3vTHJ15t/T0ryriSfS/KJJNt+274kSZIkjZHQmeBrMbf5VNWXgBvm2OUpwHubuvLrwO5J7rk0r2ymYRXL7wMuAz6T5KtJDp5j33Or6neAm4H7925MsjbJuiTrbtywcUBxJUmSJEmzyl3qhr16up5rbmv7PPq+wFVdy1c365bcUCb4qqpNwOuB1yd5HPBPPbt0f8Xwzebfq4A9ZnmuE4ATAH5rt11r6dNKkiRJkuaSxU/wtb6q1tyVQ8+ybiB14VCK5ST3Aq6tqg3AdXRatFcl
mQT2A1Z37d79Qu2GLUmSJEmjJDDR3qWjrgb271reD7hmEAca1qWj7g98KMntzfLLgCcBXwMuZO4+6ZIkSZKkEREgk621a54GvCzJyXQmj/5FVV07iAMNqxv26cDpPasvBv65Z78XdN1/5eCTSZIkSZL6koVdBmpRT518EDiKztjmq4HXANsBVNXxwBnA7wOX05lI+oWDSTK8lmVJkiRJkuZUVcfNs72Alw4ji8WyJEmSJKkvd2GCr7FhsSxJkiRJ6kMsliVJkiRJmmGAY5ZHicWyJEmSJGnBgt2wJUmSJEmaKZDJtkMMnsWyJEmSJKkvtixLkiRJktQty6NYXgbDsiVJkiRJ6o8ty5IkSZKkvjgbtiRJkiRJXRLI5LbfDdtiWZIkSZLUhyyLMcsWy5IkSZKkhYvdsCVJkiRJ2pIty5IkSZIkdVkmY5aXQeO5JEmSJEn9GeuW5YkVYdXqlW3HYMf77tV2BABW3O0ebUcA4CZGI8eVN9zedgQApjZsajsCAJM7bt92BAAOOOBubUcA4KhDVrcdAYDf2u1nbUdg4qbL244AwMbrr207AgB3XHNd2xEAqNs2th0BgMk9dmo7AgAbr1vfdgQAdvjNthN03Prd77UdAYDrP3pZ2xEA+Pllt7QdAYC9HrBL2xEAmNh+NNrDbrzkV21HAOCm60bjfLqtcYIvSZIkSZK6BCf4kiRJkiRppgDLYMyyxbIkSZIkqQ9eZ1mSJEmSpJnimGVJkiRJkrawHIrlZTAsW5IkSZKk/tiyLEmSJElaOCf4kiRJkiRpS146SpIkSZKkLomzYUuSJEmStCW7YUuSJEmS1MVLR0mSJEmSNAuLZUmSJEmSZloOLcvLYA4zSZIkSZL6Y8uyJEmSJGnhEuIEX5IkSZIk9VgG3bAtliVJkiRJC7dMZsMe6pjlJLsm+WSSs5Ocl+RJW9lv7TBzSZIkSZL6MJnF3cbIsFuWnwecWVVvTxJgt63stxY4YXixJEmSJEkLEVuWB+JW4GFJ9q6Om5K8v2lp/nKSA5I8FTikWfdHQ84nSZIkSZpTOmOWF3MbI8NuWX4fcE/gM0luBZ4PrK2qW5M8GXhxVf19ksuq6qjZnqDpor0WYJ8dVw0ptiRJkiRpORlqsVxVm4DXA69P8jjgn4D1SQ4HtgcuWcBznEDTRfv+d9utBhhXkiRJktTLbthLL8m9kqxsFq8DVgN7V9VjgP8JTL/jFsGSJEmSNKrshr3k7g98KMntzfJfAm9L8lng0q79vpjkNOBdVfXJIWeUJEmSJG1NIJPDnv5q+IbdDft04PSe1Y+ZZb+/G04iSZIkSVJ/xq+VeDGG3bIsSZIkSRpzy2HMssWyJEmSJGnhwrJoWd72O5pLkiRJktQnW5YlSZIkSX2xG7YkSZIkSV2SOBu2JEmSJElbsGVZkiRJkqQusRu2JEmSJElbsliWJEmSJKlLQia2/THL2/4rlCRJkiQtrYks7jaPJMckuSzJ5UleOcv23ZJ8Msm3k1yS5IUDeX1YLEuSJEmSRkCSSeDtwBOA+wHHJblfz24vBS6tqgcCRwFvTrJyEHnshi1JkiRJ6s9gxiw/DLi8qq4ASHIy8BTg0q59CtglSYCdgRuATYMIY7EsSZIkSVq4MKgxy/sCV3UtXw0c0bPP24DTgGuAXYA/qqqpQYSxG7YkSZIkqQ+LHK/caY1enWRd123tzCfeQvUs/x7wLWAf4HDgbUl2HcSrtGVZkiRJkrRguWsty+uras1Wtl0N7N+1vB+dFuRuLwTeUFUFXJ7kR8ChwHmLDbQ1tixLkiRJkvozmNmwzwcOTnJQM2nXsXS6XHe7EvhtgCR7A4cAVyzxqwPGvGV55T57cNA/Pr3tGFxzj+e0HQGAky6+re0IAJx12o/bjgDAz65c33aEkfLK5z+i7QgAPG3nT7QdAYCff/wNbUcA4PZdtm87AjnkgLYjdAxmopC+bbfHQHpy9W1ivx3ajgDAxvU3th0BgJX77dt2BACy
92FtRwBg8sc/bDsCALdee3vbEQD4yZUb2o7Q+FXbAQBYtfNk2xEAuP3mzW1HAGDVjrYPDkIG8He7qjYleRnwGWASOLGqLknykmb78cDrgJOSfIdOt+1XVNVAPviPdbEsSZIkSdp2VNUZwBk9647vun8N8LvDyGKxLEmSJElauAQGMxv2SLFYliRJkiT1ZRDdsEeNxbIkSZIkaeGCLcuSJEmSJM0UW5YlSZIkSdqCLcuSJEmSJHUJJNt+sbztv0JJkiRJkvpky7IkSZIkqQ8BxyxLkiRJktTDMcuSJEmSJN0p8TrLkiRJkiT1CCyDCb4sliVJkiRJfYndsCVJkiRJ6hKWxQRfC/46IMkTZln3kqWNI0mSJElS+/ppO/8fSR4/vZDkFcBTlj6SJEmSJGl0NWOWF3MbI/10w34ycHqSvwWOAQ5t1kmSJEmSlhHHLHepqvVJngx8DrgAeEZV1cCSSZIkSZJGT/A6ywBJfgV0F8UrgXsDz0hSVbXrYg+e5EDgfOCSZtVrquqcZtsLgMuq6muLfX5JkiRJ0lILybY/wde8xXJV7TLgDOdU1TO6VySZqKqTBnxcSZIkSdJi2LJ8p3S+OngOcFBVvS7J/sA9q+q8pQqT5FJgHXB906K9rqpO79lnLbAW4IB97rZUh5YkSZIkLUQYu8m6FqOfV/gO4BHAs5vlm4G3L0GGI5OcneRs4L7AX1bV32xt56o6oarWVNWa1XvuvASHlyRJkiQtXMjExKJu46Sf2bCPqKoHJ/kmQFXdmGTlEmT4dTfsJBdW1Y1L8JySJEmSJC1aP8XyxiSTNJN9JdkLmFriPEv9fJIkSZKkpbYMumH3Uyy/FfgYsHeS/wU8A/iHgaSSJEmSJI2mxAm+ulXVB5JcAPw2nSHdf1hV370rB6+qH9MpuqeX13Tdf+1deW5JkiRJ0tILeOmoWawGbq2q9yTZK8lBVfWjQQSTJEmSJI0oW5bvlOQ1wBrgEOA9wHbA+4FHDSaaJEmSJGn0xDHLPZ4KPAi4EKCqrkmyy0BSSZIkSZJGk9dZ3sKGqirunA17p8FEkiRJkiSpXf20LH84yf8P7J7kz4AXAe8aTCxJkiRJ0mgKcczynarqTUmOBn5JZ9zyq6vqswNLJkmSJEkaTcugG3Y/E3y9CDi3qv52gHkkSZIkSaPOYnmGA4HnJrkXcAFwLp3i+VuDCCZJkiRJGkFxNuwZqurVAEl2AP4M+Fvg34DJwUSTJEmSJI0ki+U7JfkHOtdU3hn4JvByOq3LkiRJkqRlwwm+ej0N2AR8CjgH+HpV3T6QVJIkSZIktWjBXwdU1YOB3wbOA44GvpPky4MKJkmSJEkaUcnibmOkn27YhwGPAY4E1gBXYTdsSZIkSVpegmOWe7yRTvfrtwLnV9XGwUSSJEmSJI0uZ8Oeoar+YK7tSU6tqqff9UiSJEmSpJFmsdyXey/hcy1Irdyd2/d/8rAPu4VPXnRH2xEAOOuCK9uOAMC137+m7QgATPzylrYjAFA779B2BAB2XTUaV3nbtH40fj42XDsiPx+bptqOwPa3jcZcjdl+ZdsRANg8Iu8HE6Mxrqs2bW47AgC1YTT+1vYx3ctAjcostNX+KQyAEXk7uPXm0fh92XD7aPzH7LznUpYai7fXw+/WdoSOb7QdYIlZLPellvC5JEmSJEkjyW7YkiRJkiTNtEwm+FrKVzga/cUkSZIkSbqL5i2Wk3y++feN8+z6iiVJJEmSJEkaYaFTSi7m1p4kOyVZ8EQ+C+mGfc8kRwJPTnIyPS3IVXVh8+9ZfSWVJEmSJI2nMeiGnWQCOBZ4DvBQ4A5g+yTXA2cAJ1TVD7b2+IUUy68GXgnsB/xrz7YCHr+I3JIkSZKkcTUGxTLwReBzwKuAi6s6c/gn2RN4HPCGJB+rqvfP9uB5i+WqOgU4Jcn/qKrXLV1uSZIkSdL4GZvZsH+nqjb2rqyqG4BTgVOTbLe1By/4FVbV65I8OcmbmtsTF5dXkiRJkjS2
pmfDXsxtiKpqY5JnAyQ5dmv7bO3xC750VJJ/Bh4GfKBZ9ZdJHlVVr+ojryRJkiRprE1P8DUW9k3yLDrDivvSzyv8A+Doqjqxqk4EjmnWSZIkSZI0UpK8BtgT+L/Ankle3c/j+/06YPeu+7v1+VhJkiRJ0rZgQN2wkxyT5LIklyd55Vb2OSrJt5JckuScrT1XVf0jcAPwXOCGqvqnfl7igrthA/8MfDPJF+m0uz+WzqxikiRJkqTlZADjj5trIL8dOBq4Gjg/yWlVdWnXPrsD7wCOqaork9x9nqe9pqpOTnJcv3kWXCxX1QeTnE3n+lQBXlFVP+0K/VtVdUm/ASRJkiRJ4yQUGcQTPwy4vKquAEhyMvAU4NKufZ4NfLSqrgSoquu2mjLZuao+0Oz3wTn2uXm2bX19HVBV11bVaVX1ie5CufG+fp5LkiRJkjSmFt8Ne3WSdV23tV3Pui9wVdfy1c26br8J7JHk7CQXJPnjOVJ+Ismbkzw2yU6/jp7cO8mfJPkMnbm4ZtVPN+z5DOSrBUmSJEnSqFl0N+z1VbVmK9tmqymrZ3kF8BDgt4EdgK8l+XpVfX+LB1b9dpLfB14MPCrJHsAm4DLgDOD5szQCzzjQUul9EZIkSZKkbU4Gdc3kq4H9u5b3A66ZZZ/1VXULcEuSLwEPBLYolgGq6gw6hXHfxubiWJIkSZKkbdr5wMFJDkqyEjgWOK1nn08Aj0myIsmOwBHAd+d60iSfX8i6XkvZsrxhCZ9LkiRJkjSiagDtrlW1KcnLgM8Ak8CJVXVJkpc024+vqu8mORO4CJgC3l1VF8/2fElWATvSGSe9B3d2894V2Ge+PAsulpOcCpwIfLqqpmZ5YQ+f47ErgbOaxYcAFzT311XVyxeaQZIkSZLUsgAZzJRVs3Wbrqrje5b/BfiXBTzdi4G/olMYX8CdxfIv6Vyiak79tCy/E3gh8NYkHwFOqqrvLeSBVbUBOAogybqqOirJUcAT+zi+JEmSJKl1YRxG9FbV/wH+T5K/qKp/7/fx/Vxn+XPA55LsBhwHfDbJVcC7gPdX1cZ+Dw4cluQTwIHAc6vqO00xvQagmdVsqy3WkiRJkqThKqAGM8HXQFTVvyd5JJ26c0XX+vfO9bi+xiwnuRvwXOB5wDeBDwCPBp5P03Lcp+2q6pgkR9Nptf7rBWRYC6wF2P+A/RZxSEmSJEnS4o1Hy/K0JO8D7gN8C9jcrC5gaYrlJB8FDgXeBzypqq5tNn0oybq+E3d8q/n3KmCP2Q7bu6KqTgBOAHjwQw73clWSJEmSNGQ16yWRR9Ya4H5V1Vf92E/L8tuq6guzbZjjotLz6Q47/W5vTrJrc//gRT6vJEmSJGlQxqgbNnAxcA/g2vl27NbPmOUvJDkMuB+wqmv9nE3Xi/A24EvAJWx5AWpJkiRJkvqxGrg0yXnAHdMrq+rJcz2on27Yr6EzLvl+dKbyfgLwZebp591ruhW6qs4Gzm7ufw94QXP/fXS6ekuSJEmSRtCYdcN+7WIe1E837GcADwS+WVUvTLI38O7FHFSSJEmSNK7Ga4KvqjpnMY/rp1i+raqmkmxqxhRfB9x7MQeVJEmSJI2vGqNiOcmvuHO+rJXAdsAtVbXr1h/VX7G8LsnudK6rfAFwM3DeIrJKkiRJksba+HTDrqpdupeT/CHwsPke188EX3/e3D0+yZnArlV1UV8pJUmSJEljLmPVstyrqj6e5JXz7TdvsZzkwXNtq6oL+w0nSZIkSRpPBVTGp2U5ydO6FifoXHd53msuL6Rl+c3Nv6uaJ/02nTb3BwDfAB7dV1JJkiRJkobnSV33NwE/Bp4y34PmLZar6nEASU4G1lbVd5rlw4CXLyapJEmSJGl8jdOlo6rqhYt5XD8dzQ+dLpSbA14MHL6Yg0qSJEmSxtX0paMWcxu+JPsl+ViS65L8LMmpSfab73H9pP1ukncnOSrJkUneBXx38ZElSZIkSeOoyKJuLXkPcBqwD7Av8Mlm3Zz6
KZZfCFwC/CXwV8ClzTpJkiRJ0jJSlUXdWrJXVb2nqjY1t5OAveZ7UD+XjrodeEtz20KSU6vq6Qt9PkmSJEnSeBqzS0etT/Jc4IPN8nHAz+d70FK+wnsv4XNJkiRJkkbQYrtgt9gN+0XAs4CfAtcCz2ABvaQX3LK8APNep0qSJEmSpCF7HfD8qroRIMmewJvoFNFbtZTFsiRJkiRpGRinS0cBD5gulAGq6oYkD5rvQUtZLI/VuyVJkiRJWpwxK5YnkuzR07I8by3cV7GcZCVwKJ0u15dV1Yauza/o57kkSZIkSeNpzIrlNwNfTXIKnVr2WcD/mu9BCy6Wk/wBcDzwQzqtyAcleXFVfRqgqs5aTOq7YuPUJD+9fbdhH3YLV1x3XdsRALjphpvbjgBAbr297QgArLh9w/w7DcGGHVe1HWGk1ObNbUcAYGrDaOSojVNtR4AVk20nACATozGrZjIif/xH5P2Y2M4RWzPUprYTAFCbRiPH1NRoTFmz/Xaj8Xu7656j8fuycqfROK/vfK8d244AwB7H/FbbETrmLc3GR9HqZaD6VlXvTbIOeDydWvZpVXXpfI/r5zf6zcDjqupygCT3AT4FfHoReSVJkiRJY2rMWpZpiuN5C+Ru/RTL100Xyo0rgNFoUpUkSZIkDc24FcuL0U+xfEmSM4AP0+nn/Uzg/CRPA6iqjw4gnyRJkiRpxIxTN+zF6qdYXgX8DDiyWb4e2BN4Ep3i2WJZkiRJkrRNWHCxXFUvHGQQSZIkSdJ4sBt2lyTvodOCPENVvWhJE0mSJEmSRlYBI3A9j4Hrpxv26V33VwFPBa5Z2jiSJEmSpJFW43XpqMXqpxv2qd3LST4IfG7JE0mSJEmSRprdsOd2MHDAUgWRJEmSJI2+Aqa2GKC77elnzPKvmDlm+afAK5Y8kSRJkiRppNkNe6a9qur27hVJ9lziPJIkSZIktW6ij31PTfLr4jrJPYDPLn0kSZIkSdIom1rkbZz0Uyx/HDglyWSSA4GzgFcNIpQkSZIkaXRVMyN2v7dx0s9s2O9KspJO0Xwg8OKq+uqggkmSJEmSRo/XWW4k+evuRWB/4FvAw5M8vKr+dVDhJEmSJEmjZ9xaiRdjIS3Lu/Qsf2wr6yVJkiRJ27ry0lEAVNU/DiOIJEmSJGn0FVDLoFhe8ARfST6bZPeu5T2SfGaO/f8jyRHN/Zcn+Whzf0WSb89zrEOTnLTQbJIkSZIkLaV+ZsPeq6puml6oqhuBu8+x/9eBI5r7D+xa/wDgoj6OK0mSJEkaGWFqkbdx0k+xvDnJAdMLSe5FpwV+a74BPLy5vwr4YfP4I4DLk3w8yReSvL+5HNWKJKck+Rzw0j5fhyRJkiRpSKZqcbdxsuBLRwF/D3w5yTnN8mOBtXPsfzFwvyR7A9cC59EplI8A7gW8rqq+kORvgKc2j/l+Vf33JH8GPGq2J02ydvq4++x/wGy7SJIkSZIGpBi/wncxFtyyXFVnAg8GPgR8GHhIVW11zHJVTQE3AE+kUyhPF8v3B24D/jHJ2cCzgHsAvwFc0Dz8vDme94SqWlNVa/a82+qFxpckSZIkLYXqTPC1mNs4Wch1lg+tqu8leXCz6prm3wOSHFBVF87x8G8AfwE8s6p+kuSBwM3A94CPVdW5zTG2A54CPAg4FVizuJcjSZIkSRq05dCyvJBu2H9Np9vzm5k5RjnN8uPneOw3gBdX1Q+a5duBbwFvAN6VZPqyVH8HfBw4Nsnn6RTTkiRJkqQRs1y6YS/kOsvT45J/H/hz4NF03p9zgXfO89iP0ymCp5ef1LX5abM85Bnz5ZEkSZIktctieab/BH4JvLVZPg54L50xx5IkSZIkbTP6KZYPqaru6yV/Mcm3lzqQJEmSJGl0LZdu2P1cZ/mbSaavm0ySI4CvLH0kSZIkSdLIWuQ1lhdSYCc5JsllSS5P8so59ntoks1JBjaUt5+W5SOAP05yZbN8APDdJN8BqqoesOTpJEmSJEkjZxAt
y0kmgbcDRwNXA+cnOa2qLp1lvzcCW72U8VLop1g+ZmApJEmSJEljYYDdsB8GXF5VVwAkOZnOJYYv7dnvL+hccvihA0nRWHCxXFU/GWQQSZIkSdI4KKZq0dXy6iTrupZPqKoTmvv7Ald1bbuaTg/nX0uyL/BUOpcwHo1iWZIkSZKku9iyvL6q1mxlW7ZyuG7/BryiqjYns+2+dCyWJUmSJEmj4Gpg/67l/YBrevZZA5zcFMqrgd9PsqmqPr7UYSyWJUmSJEkLV7B5MGOWzwcOTnIQ8F/AscCzZxy66qDp+0lOAk4fRKEMFsuSJEmSpD4MaoKvqtqU5GV0ZrmeBE6sqkuSvKTZfvzSH3XrLJYlSZIkSX25CxN8zamqzgDO6Fk3a5FcVS8YSIiGxbIkSZIkqS9TU20nGDyLZUmSJEnSglXB5gG1LI8Si2VJkiRJUl8GMWZ51Ey0HUCSJEmSpFFjy7IkSZIkacE6s2Fv+03LY10s37oRvnXt5rZj8IOrb2w7AgC3rf9l2xEAWPXLW9qOAMCKTe3/bABsGJHZD0ZlXMnEqh3ajgDA5A6jcfqbGIEcE9uvbDsCMDo/GxMbN7YdAYDJVavajgDA1M23th0BgNq8qe0IAGRqQ9sRAJjaOBrvx9Sm0fjbsusuk21HAGDVbu2f0wG222U0cux48G5tRwBg1YEHtx1hm7R5ND7iDtRo/CZJkiRJksZClS3LkiRJkiRtYUQ6Tw6UxbIkSZIkacGKGpkhfoNksSxJkiRJ6ovdsCVJkiRJ6lK1PCb48jrLkiRJkiT1sGVZkiRJktQXu2FLkiRJktSlgM1TFsuSJEmSJN2pYBnUyhbLkiRJkqSFs2VZkiRJkqQeXmdZkiRJkqReBVPLoGXZS0dJkiRJktTDlmVJkiRJ0oI5ZlmSJEmSpFksg1rZYlmSJEmStHBVtixLkiRJktTD2bAlSZIkSZqhWB6zYQ+tWE6yEjirWXwIcEFz/4lVdfOwckiSJEmS7gK7YS+tqtoAHAWQZF1VHTWsY0uSJEmS1I/WrrOc5LVJntjcf0mSFzT3/3uSc5J8Kcn9Z3nc2iTrkqz75Y3XDzm1JEmSJC1vBWyuWtRtnLRWLM+mKY4PqaojgWcB/9S7T1WdUFVrqqkfEHkAAA/WSURBVGrNrnvsNfSMkiRJkrTcTU3Vom7jpM0JvrrfqTT/3hd4ZJKzm+XNQ00kSZIkSZpTVTlmecBuBPZv7j8E+DLwPeCcqvpTgCTbtZRNkiRJkrQVFsuDdQrwiWbc8m0AVXVRkh8kOQeYAj4LvL7FjJIkSZKkLgVMjdn448VopViuqjXN3YfOsu2NwBuHm0iSJEmStCBeOkqSJEmSpJkKxm6yrsUYqdmwJUmSJEkaBbYsS5IkSZL64GzYkiRJkiTNUI5ZliRJkiRpS8thzLLFsiRJkiRpwQrY7KWjJEmSJEnqUsXU1FTbKQbOYlmSJEmStGDF8hiz7KWjJEmSJEnqYcuyJEmSJGnhygm+JEmSJEmaYbl0w7ZYliRJkiT1xZZlSZIkSZK6VJXFsiRJkiRJvTZ76ShJkiRJku5Uy2SCLy8dJUmSJElSj7FuWd5cxa/u2Nx2jJH5ViUrJtuOAMDmEcmhma6/eWPbEQBYsf/9244AwG5H/rTtCABM7rRj2xFYecB9244AQCZG40/S5C4/bzsCANlu+7YjADCx405tR+iYHJG/LdX+5w6Aie1G4/dl9/u0fw4D2L3tAI1dHrxX2xEA2H6/PdqOAMAOv3mftiMAcNvdH992hG3SqNRAgzQaZ1pJkiRJ0lioKi8dJUmSJElSr+XQsuyYZUmSJEnSghUwVVOLus0nyTFJLktyeZJXzrL9OUkuam5fTfLAQbxGsGVZkiRJktSPAc2GnWQSeDtwNHA1cH6S06rq0q7dfgQcWVU3JnkCcAJwxJKHwWJZkiRJktSHogbVDfthwOVVdQVAkpOBpwC/Lpar6qtd+38d2G8QQcBiWZIkSZLUp6mp
+btUb8XqJOu6lk+oqhOa+/sCV3Vtu5q5W43/BPj0YoPMx2JZkiRJkjQs66tqzVa2ZZZ1szZhJ3kcnWL50UsVrJfFsiRJkiRp4QY0ZplOS/L+Xcv7Adf07pTkAcC7gSdU1c8HEQQsliVJkiRJfSigBlMsnw8cnOQg4L+AY4Fnd++Q5ADgo8Dzqur7gwgxzWJZkiRJkrRwVXdlzPIcT1ubkrwM+AwwCZxYVZckeUmz/Xjg1cDdgHckAdg0R7fuu8RiWZIkSZLUlwF1w6aqzgDO6Fl3fNf9PwX+dCAH72GxLEmSJElasCqozYMplkeJxbIkSZIkqQ+D6YY9aibaDiBJkiRJ0qixZVmSJEmS1JcBzYY9UiyWJUmSJEkLVxbLkiRJkiTN0LnOsmOW77IkByapJI9rllcmubG5fpYkSZIkaZxUUVOLu42TYU3wtQ54WnP/d4AfDOm4kiRJkqQlVlNTi7qNk2EVyz8BDkgS4KnAxwCSvDzJ15J8NclDmnUXJnlnkm8kedWQ8kmSJEmSFsiW5aX1NeCxwF7AtcDuwJOBRwHPBd7Y7Lc78AbgEcCxvU+SZG2SdUnW3Xzj+mHkliRJkiQtM8Mslk8F3gKc3bXu21U1VVVXALs1626sqp9U1RRwW++TVNUJVbWmqtbsvMfqgYeWJEmSJHUpW5aXVFX9APgycErX6sOTTCS5N3DT9K7DyiRJkiRJ6lfB1NTibmNkqJeOqqr/BtAZusxNwCeAr9ApkP9imFkkSZIkSf0rr7O8NKrqx8Azetad1LX4pp5ta7ruP3yQ2SRJkiRJizBmrcSLMdSWZUmSJEnSuBu/8ceLYbEsSZIkSVq4wpZlSZIkSZK2sAxalod56ShJkiRJksaCLcuSJEmSpIWrInbDliRJkiSpxzLohm2xLEmSJElasIAty5IkSZIkzVDYsixJkiRJ0kyOWZYkSZIkaaZaHt2wvXSUJEmSJEk9bFmWJEmSJPVlwjHLkiRJkiTdKY5ZliRJkiSpR9myLEmSJEnSFmxZliRJkiSpS6qYWAbFcqrGt/k8yfXAT+7i06wG1i9BnLvKHDOZYyZzzGSOmcwxkzlmMsdMo5BjFDKAOXqZYyZzzLQUOe5VVXstRZi2JTmTznuyGOur6pilzDMoY10sL4Uk66pqjTnMYQ5zmMMc5jDHcslgDnOYYzxzaLi8zrIkSZIkST0sliVJkiRJ6mGxDCe0HaBhjpnMMZM5ZjLHTOaYyRwzmWOmUcgxChnAHL3MMZM5ZhqVHBqiZT9mWZIkSZKkXrYsS5IkSZLUw2JZkiRJkqQey7pYTvIvSc5N8oEkK1vMsUuSbyS5OclhLWV4SPNenJPkw0m2aynHYUm+0uT4VJKd28jRlee45nrebR3/wCTXJzm7ubV2bb4kRyX5fPN/85SWMjys6724LMlbWsoxkeQ/m9+Zc5Pcp6Uck8356+wkJw3z93a281aSP0ry1SRfSLJ/izne0/zevGwYGWbLkWSnJGcl+VKSLyY5sI0czbovNT8jXx3W35it/V1LckCSO9rMkeQHXeeRo1vMsV+S05ocr2kjR5KVXe/FN5J8s40czbqXJTmvuT2pxRx/0/yunJVknyHl2OIzWEvn09lyDPV8OkuG3Vo6l872Xgz9XKoRUFXL8gY8CHh/c//vgWe3mGUFsBdwEnBYSxnuAezY3H898MyWcmzXdf81wPNa/H+ZAE4FLmwxw4HAKW0dvyvHKuCTwMq2s3RlejdwZEvHfjBwcnP/aOAtLeV4JvC65v7fAX80xGPPOG8B2wHfAFYCjwJOaCNHs+6ewAuAl7X4fmwP7Nts+13g7S2+H9s1/x4J/EdbOZr1bwe+MKy/dVt5P9YN6+dinhwfnP4ZaTNH17bnAq9p8f24tFm/K/C1NnLQ+Sz0BSDAw4B3DinHFp/BWjqfzpZjqOfTWTL8UUvn0tnei6GfS721f1vOLcuPAM5q7p8JPLKtIFW1qapaa71sMvy0qm5tFjcCm1rK
sbFrcUfge23kaDwbOAWYajEDwKOabzdfnyQtZXgkcBvwySQfS3KPlnIAkGQF8HDg3JYiXN3kCLA70Nbv772BbzX3LwQeM6wDz3LeOhi4pKo2VNVXgPu3lIOqunYYx54rR1XdUVX/1SwO7Zy6lfdj+ry6K/CdtnIkOQgo4MphZNhaDmDnprXo/ybZs40c6fQCORB4c9NyOJTPIPN83ngm8JEWc1wO7ADsAvy8pRz3onMeKzrn1EcPKUfvZ7DfpJ3z6RafBYd9Pp0lw4aWzqWzvRdDP5eqfcu5WN4d+GVz/xfAUP5gjrokBwC/A5zeYoajm65gjwN+2FKGSeBZwIfaOH6Xa4HfAB4L3B14aks59gYOAp5E59IJr20px7THA+dUVVtfZKyn8yXKd4H/TadVog3fpfNeQOf3dveWcsDMcyrAZFtBRklTFL0aeGuLGfZK8hXgHcCX2soBvAJ4U4vHn/aoqjqSzhflr20pw2rgAcDL6Xwx+28t5QA63ZGB/avq0hZjnEmndfl82vt9+SHw0CTb0zmn7jHMg3d9BvsyLZ5PR+Sz4IwMbZ1Lu3OM0LlUQ7Sci+Ub6XwzBJ0PeTe0mGUkJNkVeB/wwp4W3qGqqs9W1YPotOqubSnGc4EPt1iMAb9unbql+Zb7VODwlqLcBHy5qjbQ6aJ2v5ZyTBtaC8hW/B5wW1UdCjwd+NeWcpwObEjyBTo9MX7aUg6YeU4F2NxWkBFzAnB8VbXyxR9AVV1fVY+i87P6+jYypBnXX1U/buP43apqutXyI7R7Tv1+VV1dVT8FNjU9ZtryZOC0tg7efP5YS6eHyqFAKz2pqmo9cDydnoe/B1w2rGN3fwYDrqOl8+kofBbcSoahn0t7c4zCuVTDt5yL5a/TGfsAnRPiV1rM0rqmJfUDwD9V1fdbzLF91+IvgFtainI/4I+TnAkcnPYmktqla/GxdLqpteE87iyQHwRc0VKO6S7Yj6D9b3VvbP69iZZadKtqqqr+v6p6PJ0v/D7eRo7G5cD9mgmDHgVc1GKWkZDkH4AfVVVrPVSSrEgy/be+zXPqA4Hfas6pRwPHp4WJJJufz+m/M62dU6vqNuCmZvKinejMB9HK8KdG219ATgG3A3cAt9IZ89/KsKOqOqnpefAJOl8OD9wsn8FaOZ+OwmfB2TK0cS7tzTFC51INWToNVstTkn+hM+7xSjrfGm1oMcsZdL7h/gmdCSXeO+TjHwe8jTvHYLyzjQ94SZ4I/C2dP5zXAy/oGjPSiiTrqmpNS8d+AvA/6Xx4+BHworY+UCV5KZ2JNqaaHK0UzOnMXvvUqvrzNo7fZJik823zvnQ+1P11VX21hRz3AE6mM4brc1X1hiEff8Z5i84H3b+i86H3j6vqqpZy3JdOS9kk8Kmq+puWcvwHd34R+7WqelULOY4HXkTn93YKeGlVDWUuiK39XUtyEvCmqrq4hRzHA/+NzgfdO+icy9r6Ob0ceCOdyfH+sao+3VKOj9EZ1vLgYRx/jhx3B55B5/f2P6rq+JZy/D6dSb9+Quf35bYhZNjiM1jz71DPp1vJcThDPJ/OkuE9dCb0HOq5dCs5WjmXql3LuliWJEmSJGk2y7kbtiRJkiRJs7JYliRJkiSph8WyJEmSJEk9LJYlSZIkSephsSxJkiRJUg+LZUmSJEmSelgsS5LGTpJ3J7nfHNtfm+TlAzr2C5K8bRDPLUmSRseKtgNIktSvqvrTtjMstSQrqmpT2zkkSVKHLcuSpJGV5MAk30vyn0kuSnJKkh2TnJ1kTbPPMUkuTPLtJJ+f5Tn+LMmnk+yQ5Oau9c9IclJz/6Qkxyc5N8n3kzxxnmj7JDkzyQ+S/O+u5zwuyXeSXJzkjV3r5zruvyb5IvBGJEnSyLBlWZI06g4B/qSqvpLkRODPpzck2Qt4F/DYqvpRkj27H5jkZcDvAn9YVXckmes4BwJHAvcBvpjkN6rq9q3sezjwIOAO4LIk/w5splPwPgS4ETgryR9W1cfn
eX2/CfxOVW2eZz9JkjREtixLkkbdVVX1leb++4FHd217OPClqvoRQFXd0LXtecATgKdX1R0LOM6Hq2qqqn4AXAEcOse+n6+qXzTF9KXAvYCHAmdX1fVNd+oPAI9dwHE/YqEsSdLosViWJI26mmM5s2yfdjGd1uL9tvLYVX0cp1d38b2ZTk+tuZqt5zruLXM8TpIktcRiWZI06g5I8ojm/nHAl7u2fQ04MslBAD3dsL8JvBg4Lck+zbqfJblvkgngqT3HeWaSiST3Ae4NXNZnzm80WVYnmWyynrOA40qSpBFksSxJGnXfBZ6f5CJgT+Cd0xuq6npgLfDRJN8GPtT9wKr6MvBy4FNJVgOvBE4HvgBc23Ocy+gUt58GXjLHeOVZVdW1wKuALwLfBi6sqk80m+c6riRJGkGpmquXmSRJ7UlyIHB6VR024OOc1BznlEEeR5IkjQ9bliVJkiRJ6mHLsiRJs0jye2x57eMfVZVjjiVJWgYsliVJkiRJ6mE3bEmSJEmSelgsS5IkSZLUw2JZkiRJkqQeFsuSJEmSJPX4f/qrSh5TwkmzAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 1080x360 with 2 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Number of pick-ups for each hour of the day and day of the week (2D histogram)\n",
    "df.plot('pickup_hour', 'pickup_day_of_week', colorbar=True, colormap=cm_plusmin, figsize=(15, 5))\n",
    "\n",
    "plt.xticks(np.arange(24), np.arange(24))\n",
    "plt.yticks(np.arange(7), weekday_names_list)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:13:09.412001Z",
     "start_time": "2020-06-10T17:13:04.235866Z"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAA80AAAFgCAYAAACBq5vsAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nO3deZhkZX33//enh4FBAVFAJcCAGqPiwuIoJBjBLS5xCS4xGjFiDPqLxiWaGDWJGp9HJUbNYzTyG/URt8So4IYBJcoqAg77Mi7EPRBlws7ADDP9ff6o01rT9FLV01Wnavr9uq5zdZ1TZ/lUTfeZ+tZ9n/ukqpAkSZIkSXc20XYASZIkSZJGlUWzJEmSJEmzsGiWJEmSJGkWFs2SJEmSJM3ColmSJEmSpFls13aArbHb7rvVyn1Xth2DZbWx7Qgdk6ORo+7Y0HYEADavv63tCABM3rap7QgAbLfrXdqO0HHX3dtOAMCtd4zG6W/9HZNtR+COTe1nALhj82jk8K4SW0rSdgQAlk2Yo9vEiPy7bB6Rv5dReT92XD4a7VErlo/G+7HDxOa2IwAwsfnWtiMAcMElV62rqj3azrEYDlxx97p58o4FbfuDO279alU9aZEjDdRofGpcoJX7ruS0c05rOwY7T/5X2xEAyM0/aDsCAJt+Pho5brlgbdsRALj1smvbjgDAbkce0HaEjkce03YCAM79+W5tRwDgop/d0nYErr5+fdsRALj2xtvbjgDAhjtG44uuUbF8u9EoAu521x3ajgDAzjsubzsCAHfdYTQ+wt1028I+NC+2UXk/HrTnzm1HAODB9xqN39P97np92xEAuOuN57cdAYCJez71x21nWCw3T97BO+914IK2/f2ffXM0WlD6MBpnGEmSJEnSWAiwbDQ6NAyFRbMkSZIkqXcJEyNyucowjEafK0mSJEmSRpAtzZIkSZKkvkwsoeZXi2ZJkiRJUs+8plmSJEmSpNmEJXVNs0WzJEmSJKlnwe7ZkiRJkiTNLEuraF5CL1WSJEmSpP7Y0ixJkiRJ6lmAiSyda5ptaZYkSZIk9WViYmHTfJKsSHJ+kkuSXJHkrXOs+4gkm5M8ezFf23S2NEuSJEmSejfY0bM3AI+tqluSLAfOTnJyVZ27RYRkGXAs8NVBBZli0SxJkiRJ6lmADKjPclUVcEszu7yZaoZV/ww4AXjEYJL8it2zJUmSJEl9GVT3bOi0Iie5GPgFcGpVnTft+b2AI4HjFvt1zcSiWZIkSZI0LLsnWdM1HTN9haraXFUHAnsDj0zykGmr/CPw+qraPIzAds+WJEmSJPVu665pXldVq3pZsapuSHI68CTg8q6nVgGfTmcE792BpyTZVFVfWGiouVg0S5IkSZJ6NshrmpPsAdzRFMw7Ao+nM+DXL1XVfbrWPx44aVAFM1g0S5IkSZL6kQxy9Ow9gY81o2NPAJ+pqpOSvAygqoZyHXO3oRTNSXYBPgXsDNwFeFtVfXkYx5YkSZIkLa4Bjp59KXDQDMtnLJar6kWDSfIrw2ppPgo4pao+kE7H87sN6biSJEmSpEUUBnqf5pEzrNGz19MZ9exe1XFDkjVTTyY5t/l5fJIPJfmPJF9sCuwtJDlmaqS1ddeuG1J8SZIkSdJSNKyi+RPAd4GvJjknyf3nWPesqno8nRtaP3T6k1W1uqpWVdWq3ffYfUBxJUmSJEkzSqd79kKmcTSU7tlVtQl4O/D2JI8B/m7aKt0tyhc1P38K3H0I8SRJkiRJfcgS6p49rIHA9gWuqaqNwC/otHCvaEZE25vOvbWmVPemw8gnSZIkSepRYGJMW40XYlgDgT0U+LcktzfzrwCeBnwLuBC4bkg5JEmSJElbIUCWLZ32zWF1zz4JOGna4suBd0xb70Vdj/9q8MkkSZIkSX3J+F6fvBBL6KVKkiRJktSfYXXPliRJkiRtIxwITJIkSZKkGcWiWZIk
SZKkGS2xa5otmiVJkiRJPQt2z5YkSZIkaWaBLGs7xPAsoUZ1SZIkSZL6Y0uzJEmSJKkvds+WJEmSJGkmsWiWJEmSJGlWjp4tSZIkSdIMEsgyW5olSZIkSZpBllT37CXUqC5JkiRJUn9saZYkSZIk9S5e0yxJkiRJ0uyWUPdsi2ZJkiRJUu8cCGx8hEl2yK1tx4C6o+0EAGRiNP45s2w0cmy+dWPbETpG5ISy4jcOajsCAN+/bfe2IwDww+tubzsCAJf/5Pq2I/Dz/7ml7QgA3H77aJxLb1+/oe0IAGzauKntCAAsX7G87QgATEyMRj/AZduNRo5dd71L2xEAmJystiMAsGnTZNsRALj6uvVtRwDg8v/ase0IAOz/a7u0HQGAA+79xLYjbJOW0kBgo1HdSJIkSZLGQlha1zQvoZcqSZIkSVJ/bGmWJEmSJPUujMwliMNg0SxJkiRJ6kO8plmSJEmSpBnFgcAkSZIkSZqVRbMkSZIkSTNZYtc0O3q2JEmSJEmzsGiWJEmSJPUlEwub5t1vsiLJ+UkuSXJFkrfOsM4fJrm0mc5JcsAgXuMUu2dLkiRJknqWDHT07A3AY6vqliTLgbOTnFxV53at80Pg8Kq6PsmTgdXAIYMKZNEsSZIkSerPgK5prqoCbmlmlzdTTVvnnK7Zc4G9BxKmYdEsSZIkSerd1t1yavcka7rmV1fV6i12nywDLgB+HfhAVZ03x/7+GDh5oWF6YdEsSZIkSerPwovmdVW1aq4VqmozcGCSXYHPJ3lIVV0+fb0kj6FTND9qoWF64UBgkiRJkqSRU1U3AKcDT5r+XJKHAR8GnlFV/zPIHBbNkiRJkqS+ZCILmubdb7JH08JMkh2BxwPfmbbOSuBE4Kiq+t4AXt4W7J4tSZIkSepdQgY0EBiwJ/Cx5rrmCeAzVXVSkpcBVNVxwN8CuwH/nARg03xdvreGRbMkSZIkqT8DuuVUVV0KHDTD8uO6Hr8EeMlAAszAolmSJEmS1LutGz177Fg0S5IkSZL6M7ju2SNnqAOBJdklyZeTnJ7k/CRPm2W9Y4aZS5IkSZKkmQy7pfko4JSq+kA6V2zfbZb1jgFWz/KcJEmSJKklWWLds4d9y6n1wCOT3Ks6bkjyyabl+ewkK5McCTygWfbc6TtIckySNUnWrFs30NtxSZIkSZLuJJ2BwBYyjaFhtzR/gs4Q4l9Nsh74I+CYqlqf5OnAS6vqTUm+W1VHzLSDqlpN0wp98MMPqCHlliRJkiSBA4ENUlVtAt4OvD3JY4C/A9YlORDYAbhimHkkSZIkSQtg0TwYSfYFrqmqjcAvgN2BZVX1201L8zObVW1BliRJkqRRFMiyYV/p255hv9KHAmcmOR34APA6YM8kpwKP61rvtCRfmm10bUmSJEmShmHY3bNPAk6atvi3Z1jvL4eTSJIkSZLUn/Ed1Gshhj0QmCRJkiRpzDkQmCRJkiRJMwm2NEuSJEmSNBtbmiVJkiRJmkESR8+WJEmSJEm2NEuSJEmS+mX3bEmSJEmSZhCvaZYkSZIkaXYWzZIkSZIkzSAhE0tneCyLZkmSJElSf5ZQS/PS+XpAkiRJkqQ+2dIsSZIkSerPEmpptmiWJEmSJPUueE2zJEmSJEkziy3NkiRJkiTNJLY0S5IkSZI0B1uax8NEbWSHjT9tOwbZeHPbEQCYXH9d2xEA2HzzTW1HAOCWi9e1HQGAff7iqW1HAODE65/QdgQATr60/b9ZgCvXXt12BABuv67980e2W9Z2hI7JybYTAFC3b2w7AgBZsX3bEQDYPCLvByPSorHdiuVtRxgpmzeNxt/tjncZjb+Xu911NHLcd4+d2o4AwL53H42/l3usuKPtCBpzY100S5IkSZKGL7Y0S5IkSZI0g2RkegANg0WzJEmSJKkvtjRLkiRJkjSTsKRampfOK5UkSZIkLYKQiYVN8+45WZHk/CSXJLkiyVtnWCdJ3pfkqiSXJjl4IC+zYUuzJEmSJGlUbAAe
W1W3JFkOnJ3k5Ko6t2udJwP3b6ZDgA82PwfColmSJEmS1J8Bdc+uqgJuaWaXN1NNW+0ZwMebdc9NsmuSPavqmkFksnu2JEmSJKl3gWRiQROwe5I1XdMxd9p9sizJxcAvgFOr6rxpq+wF/LRr/mfNsoGwpVmSJEmS1IfAwkfPXldVq+Zaoao2Awcm2RX4fJKHVNXlWwa482YLDTQfi2ZJkiRJUn+GMHp2Vd2Q5HTgSUB30fwzYJ+u+b2BqweVw+7ZkiRJkqSeJQxy9Ow9mhZmkuwIPB74zrTVvgS8sBlF+1DgxkFdzwy2NEuSJEmSRseewMeSLKPTyPuZqjopycsAquo44N+BpwBXAeuBowcZyKJZkiRJktSHQAY2evalwEEzLD+u63EBLx9IgBlYNEuSJEmS+pIhXNM8KiyaJUmSJEm9C1szevbYsWiWJEmSJPVhcN2zR5FFsyRJkiSpL0upe3bPrzTJk2dY9rLFjSNJkiRJ0ujop6X5b5JsqKpvACR5PXAEcNycW0mSJEmSth0BxqilOcnewB8Avw38GnAbcDnwFeDkqpqca/t+iuanAycl+QvgScADm2ULlmQ/4NvAFc2iN1fVGc1zLwK+W1Xf2ppjSJIkSZIWU0jGYyCwJB8F9gJOAo4FfgGsAH6DTl37piR/VVVnzraPnovmqlqX5OnAfwAXAM9u7o+1tc6oqmd3L0gyUVXHL8K+JUmSJEmLbXxamt9dVZfPsPxy4MQk2wMr59rBvEVzkpuB7uJ4e+C+wLOTVFXt0kfg+Y51JbAGuLY57pqqOmnaOscAxwCs3Ofei3VoSZIkSVIvwtiMnt1dMCfZEVhZVd/ten4jcNVc+5i3aK6qnbcmZA8OT3J68/hBwGFVdX2St8ySZzWwGmDVwQ9ajJZuSZIkSVLPMnajZze9pt9FpxH4PkkOBP6uqua95Lif0bOT5AVJ/qaZ3yfJIxcaussZVXVEVR0BXFRV1y/CPiVJkiRJmvJm4JHADQBVdTGwXy8b9vP1wD8Dvwk8v5m/BfhAH9v3Ys5RyyRJkiRJIyATC5vas6mqblzIhv2Mnn1IVR2c5CKApgv19gs5qCRJkiRpTCXjNBDYlMuTPB9YluT+wCuBc3rZsJ9XekeSZTSDgiXZg61sGa6qH3WPnF1Vq7oev2X6IGCSJEmSpHYFSLKgqUV/BjwY2AD8C3Aj8OpeNuynpfl9wOeBeyX538Czgb/uL6ckSZIkaeyNWUtzVa0H3tRMfennPs2fSnIB8Dg6Xy78XlWt7feAkiRJkqRxlravT+5bklOB51TVDc383YFPV9UT59u231e6O7C+qt4PrEtyn77TSpIkSZI0XLtPFczQGaMLuGcvG/Zzy6k3A68H3tAsWg58so+QkiRJkqRxF8Zx9OzJJCunZpLsSzNe13z6uab5SOAg4EKAqro6yc79pJQkSZIkjbuQMbummc61zGcnOaOZfzRwTC8b9lM0b6yqSjI1evZd+8soSZIkSdomjNk1zVV1SpKDgUPptJW/pqrW9bJtP0XzZ5L8/8CuSf4EeDHwob7TSpIkSZLG25gVzY0dgOvo1MH7J6Gqzpxvo35Gz/6HJE8AbgIeAPxtVZ260LSSJEmSpDGUsRw9+1jgucAVwGSzuIDFK5qTvBg4q6r+YiEhJUmSJElqye8BD6iqDf1u2E/37P2AFzSjjF0AnEWniL6434NKkiRJksbYmLU0Az+gcweowRXNVfW3AEl2BP4E+AvgH4Fl/R5UkiRJkjSuxnL07PXAxUm+TlfhXFWvnG/Dfrpn/zVwGLATcBHwOjqtzZIkSZKkpSRpO0G/vtRMfeune/YzgU3AV4AzgHOr6vaFHFSSJEmSNKbC2HXPrqqPLXTbfrpnH5xkZ+BRwBOADyX5eVU9aqEHlyRJkiSNm7EcPfv+wDuA/YEVU8ur6r7zbdtP9+yHAL8NHA6sAn6K3bMlSZIkSaPvo8CbgfcCjwGOptNmPq9+umcfS6db9vuAb1fV
HX2GlCRJkiRtC8aspRnYsaq+niRV9WPgLUnOolNIz6mf7tm/O9fzSU6oqmf1ur9FMbmZbLhxqIecMcat69qOAMDmG0cjx6br2/83Adi8cXL+lYZg2W77th0BgHU/3th2BABuurXvUf4HYuP60ciR9e0PDVHbeROEbstvurXtCADcMVltRwCgtu/n+/UBGpFRWje1HaCxYUTej82bNrcdAYAdVixvOwIAu+20Q9sRAHjonivmX2kI7rvLaJxPd+YXbUfYNo1f0Xx7kgng+0leAfwXcM9eNlzMVzpvX3BJkiRJ0rhrrmleyDTfnpN9kpyWZG2SK5K8aoZ17pbky0kuadY5uofQrwbuArwSeDjwAuCFvbzaxSyaR+MrcUmSJEnS4EyNnj2AoplOp57XVtWDgEOBlyfZf9o6LweurKoDgCOAdyfZfp797ldVt1TVz6rq6KaX9MpeAo1dm7okSZIkqU2hU0ouZJpbVV1TVRc2j28G1gJ7TV8N2DlJgJ2A65j/Cpo39LjsThbzQqWxu7u1JEmSJGmodk+ypmt+dVWtnmnFJPsBBwHnTXvq/cCXgKuBnYHnVtWMAxoleTLwFGCvJO/remoXehyqYt6iOcnXq+pxSY6tqtfPsepcz0mSJEmSthULHwhsXVWtmnf3yU7ACcCrq+qmaU8/EbgYeCxwP+DUJGfNsB50Cus1wNOBC7qW3wy8ppfAvbQ075nkcODpST7NtBblrqbzr/VyQEmSJEnSmBvg6NlJltMpmD9VVSfOsMrRwDurqoCrkvwQeCBw/vQVq+oS4JIk/zJ12+Qkdwf2qarre8nTS9H8t8BfAXsD75megU51L0mSJElaEjKworm5TvkjwNqqml5/TvkJ8DjgrCT3Ah4A/GCeXZ+a5Ol0auCLgWuTnFFVfz5fpnmL5qr6HPC5JH9TVW+bb31JkiRJ0jZsavTswTgMOAq4LMnFzbI30ox0XVXHAW8Djk9yWZPm9VW1bp793q2qbkryEuCjVfXmJJf2EqjngcCq6m1NZf7oZtHpVXVSr9tLkiRJkrYFU6NnL76qOpt5BpmuqquB3+lz19sl2RP4feBN/WzY8ytN8g7gVcCVzfSqZpkkSZIkSaPs74CvAldV1beT3Bf4fi8b9nPLqd8FDpwayjvJx4CL6PHeVpIkSZKkbcQABwIbhKr6LPDZrvkfAM/qZdt+79O8K50bRwPcrc9tJUmSJEnbgjEpmpP8ZVX9fZJ/ojOQ9Raq6pXz7aOfovkdwEVJTqPTx/zR2MosSZIkSUtMqLkvOx4la5ufaxa6g34GAvvXJKcDj+BXI5T999TzSR5cVVcsNIgkSZIkaUyMSUtzVX25+fmxhe6jr+7ZVXUN8KVZnv4EcPBCg0iSJEmSxsV4FM1JvswM3bKnVNXT59tHv9c0z5lnEfclSZIkSdLW+ofm5zOBewOfbOafB/yolx0sZtE8a/UuSZIkSdpWZJy6Z58BkORtVfXorqe+nOTMXvaxmEWzJEmSJGkJqDHpnt1ljyT3bW41RZL7AHv0suFiFs0bZ3siyfbA15rZhwMXNI/XVNXrFjGDJEmSJGmQAmTsrs59DXB6kh808/sBx/SyYc9Fc5ITgP8LnFxVk9Ofr6pDZ9u2qjYCRzT7WVNVRyQ5Anhqr8eXJEmSJI2CMC4DgU2pqlOS3B94YLPoO1W1Yer5JE+oqlNn2rafV/pB4PnA95O8M8kD59ugBw9J8sUklyR5aBP2l/fPSnLuIhxDkiRJkrRICqhMLGhqNXfVhqq6pJk2THv62Nm26zl1Vf1HVf0hndtK/Qg4Nck5SY5OsnxBqWF5VT0DeB1wdC8bJDkmyZoka679nxsXeFhJkiRJkn5p1v7mfZX6SXYDXgS8BLgI+D90iugZm7F7cHHz86fA3Wc65PQFVbW6qlZV1ao9drvbAg8rSZIkSVqYqe7ZC5lG1qx3g+rnmuYT6fT//gTwtKq6pnnq37q7VG9FsKkCeXOSXZrH91/gfiVJ
kiRJA1KzN8xuc/oZPfv9VfWNmZ6oqlWLlAfg/cCZwBXA1Yu4X0mSJEnSYhiT+zT34UezPdFz0VxV30jyEGB/YEXX8o/3k2SqwK6q04HTm8ffodPtm6r6BJ3WbEmSJEnSCBq3luYkK4A/BR5Fp8fz2cAHq+p2gKp65mzb9tM9+810bhu1P/DvwJObA/VVNEuSJEmSxtn43XKKTt16M/BPzfzz6DTWPme+Dfvpnv1s4ADgoqo6Osm9gA/3GVSSJEmSpGF7QFUd0DV/WpJLetmwn68HbquqSWBTM1DXL4D79rG9JEmSJGkbUEwsaGrRRUkOnZpJcgjwzV427KeleU2SXYEPARcAtwDn95NSkiRJkrQtGK9rmoFDgBcm+UkzvxJYm+QyoKrqYbNt2M9AYH/aPDwuySnALlV16UITS5IkSZLGUdpuNV6IJy10w3mL5iQHz/VcVV240INLkiRJksZLAZXxaGlOsktV3URnELA7qarr5ttHLy3N725+rgBWAZfQaYt/GHAenSG7JUmSJElLxBjdcupfgKfSucS42LJfedHDOF3zFs1V9RiAJJ8Gjqmqy5r5hwCv6z+zJEmSJEmDV1VPTRLg8Kr6ybwbzKCfjugPnCqYm4NfDhy4kINKkiRJksbV1H2aFzINX1UV8PmFbt/P6Nlrk3wY+CSdZuwXAGsXemBJkiRJ0ngao+7ZU85N8oiq+na/G/ZTNB8N/H/Aq5r5M4EP9ntASZIkSdJ4qxq7ovkxwEuT/Bi4lU5z+Zy3mprSzy2nbgfe20x3kuSEqnpWr/uTJEmSJI2nMbzl1JMXumE/Lc3zmXfUMUmSJEnSeCsyjt2z/1dVHdW9IMkngKNmWf+XFvPrgVrEfUmSJEmStFge3D2TZBnw8F42HLs2dUmSJElSu6Zam/ud5pNknySnJVmb5Iokr5plvSOSXNysc8Yc+3tDkpuBhyW5qZluBn4BfLGX17qY3bPHrn1ekiRJktS/AXbP3gS8tqouTLIzcEGSU6vqyqkVkuwK/DPwpKr6SZJ7zpqz6h3AO5K8o6resJBAfbU0J9k+ycOSPDTJ9tOefv1CAkiSJEmSxsugWpqr6pqqurB5fDOd2xzvNW215wMnVtVPmvV+Mdv+kuzXrDNjwZyOvefK1HNLc5LfBY4D/pNOq/J9kry0qk5uQnyt130tnoLJTcM/7HSTm9tOAEBtHoH3AujcO7x9mRiNzg+1Yo+2IwBw0+2j8fuxYeNo5KiNd7QdAYCJEXg/JtsO0Mim0TiX7nD7xrYjjJRNK3ZoO0LHiJzTJydH4y9mVH5La0T+bu/YaUXbEQCYGJHf0+28AHMLE5Mb2o6wzSmyNbec2j3Jmq751VW1eqYVm4L3IOC8aU/9BrA8yenAzsD/qaqPz3K8dyWZoNMV+wLgWmAF8Ot0bkP1OODNwM9mC9xP9+x3A4+pqquaF3A/4CvAyX3sQ5IkSZI05raie/a6qlo130pJdgJOAF5dVTdNe3o7OoN4PQ7YEfhWknOr6nt3yln1nCT7A38IvBjYE7iNTgv2V4D/3dxeeVb9FM2/mCqYGz+gc/G0JEmSJEmLIslyOgXzp6rqxBlW+Rmd4vtW4NYkZwIHAHcqmgGa66HftNA8/RTNVyT5d+AzdG4v9Rzg20me2QSZ6cVIkiRJkrYxgxoILEmAjwBrq+o9s6z2ReD9SbYDtgcOAd7bw75/C9iPrjp4jm7dv9RP0bwC+DlweDN/LXAP4Gl0imiLZkmSJElaArbimub5HAYcBVyW5OJm2RuBlZ3j1nFVtTbJKcCldIZn+XBVXT7XTpN8ArgfcDEwNSBDAYtXNFfV0b2uK0mSJEnadg2qpbmqzqaH2xlX1buAd/Wx61XA/rWAUYv7GT37o3Qq8S1U1Yv7PagkSZIkaTwVo3P3jT5cDtwbuKbfDfvpnn1S1+MVwJHA1f0eUJIkSZI0xmqrbjnVlt2BK5OcD/zyPmRV9fT5Nuyne/YJ3fNJ/hX4jz5CSpIk
SZLUhrcsdMN+Wpqnuz/NxdiSJEmSpKVjUNc0D0pVnbHQbfu5pvlmtrym+b+B1y/0wJIkSZKk8VPAZN/DabUryaHAPwEPonObqmXArVW1y3zb9tPSvEdV3T7twPfoJ6gkSZIkafyN4TXN7wf+APgsnZG0X0in9/S8Jvo4yAnNzaMBSHJv4NQ+tpckSZIkbQMmFzi1qaquApZV1eaq+ihwRC/b9dPS/AXgc0meBewDfAl4Xb9BJUmSJEnjbQxbmtcn2R64OMnf07n11F172bCf0bM/1BzkC8B+wEur6pwFhJUkSZIkaZiOotPT+hXAa+g0BD+rlw3nLZqT/Hn3bLPzi4FDkxxaVe/pO64kSZIkaSwV7Xe17ldV/TjJjsCeVfXWfrbt5ZrmnbumnYDPA1d1LZMkSZIkLSFVWdDUliRPo9P4e0ozf2CSL/Wy7bwtzf1W4ZIkSZKkbViN3y2ngLcAjwROB6iqi5Ps18uGPY+eneTUJLt2zd89yVf7SSlJkiRJGm8FVC1satGmqrpxIRv2c8upParqhqmZqroeuOdsKyf5SJJDmsevS3Ji83i7JJfMdaAkD0xyfB/ZJEmSJElDESYXOLXo8iTPB5YluX+SfwJ6Gti6n6J5c5KVUzNJ9qXzJcNszgUOaR4f0LX8YcClfRxXkiRJkqSt8WfAg4ENwL8ANwKv6mXDfu7T/Cbg7CRnNPOPBo6ZY/3zgL8C3gesAP6zKboPAa5K8gVgF+Bq4I/ojMz9aWBXYG0fuSRJkiRJQzSG1zTv30zbNdMzgKfTadSdUz/3aT4lycHAoXQK3NdU1bo5Nrkc2D/JvejcOPp8OgXzIcC+wNuq6htJXgsc2Wzzvap6Y5I/AQ6baadJjqEp1lfuvUev8SVJkiRJi6AYy6L5U8Dr6NSpfd0xa97u2Uke2Pw8GFhJp2X4v4CVzbIZVdUkcB3wVDoF81TR/FDgNuCtSU4Hfh+4N/DrwAXN5ufPsd/VVbWqqlbtsdvd5osvSZIkSVpMCxwErOWBwK6tqi9X1Q+r6sdTUy8b9tLS/Od0WnbfzZbXMKeZf+wc255Hp+/4c5qbSR8A3AJ8B/h8VZ0FkGQ5nebxg4ATgFW9hJckSZIkDd8YtjS/OcmHga/Tua4ZgEUhApkAABVMSURBVKo6cb4Ne7lP89R1y08B/hR4FJ1i+Szgg/Nsfh7w0qr6fjN/O50bSr8T+FCSqXtA/yXwBeAPknydTlEtSZIkSRoxY9o9+2jggcByftU9u4CtL5q7fAy4ic7AXgDPAz5Op3v1jKrqC3SK4an5p3U9/cwZNnl2H3kkSZIkSerFAVX10IVs2E/R/ICq6r511Gnz3W9ZkiRJkrTtGcOW5nOT7F9VV/a7YT9F80VJDq2qcwGSHAJ8s98DSpIkSZLG15h2z34U8EdJfkjnmuYAVVWLd8spOiNfvzDJT5r5lcDaJJf1ejBJkiRJ0pirsSyan7TQDfspmhd8EEmSJEnStmPciuZeby81k56L5q05iCRJkiRp2zCm3bMXbKLtAJIkSZIkjap+umdLkiRJkpa8YrKWTlOzRbMkSZIkqWdLrXu2RbMkSZIkqXcFmy2aJUmSJEm6s6XW0uxAYJIkSZKkkZBknySnJVmb5Iokr5pj3Uck2Zzk2YPMZEuzJEmSJKkvAxwIbBPw2qq6MMnOwAVJTq2qK7tXSrIMOBb46qCCTLFoliRJkiT1ZXJyMPutqmuAa5rHNydZC+wFXDlt1T8DTgAeMZgkv2LRLEmSJEnqWRVsXnhL8+5J1nTNr66q1TOtmGQ/4CDgvGnL9wKOBB6LRbMkSZIkadRsxUBg66pq1XwrJdmJTkvyq6vqpmlP/yPw+qranGTBQXpl0SxJkiRJ6lln9OzBDZ+dZDmdgvlTVXXiDKusAj7dFMy7A09JsqmqvjCIPBbNkiRJkqSRkE4l/BFgbVW9Z6Z1quo+XesfD5w0qIIZxr1ontzE5Prr2k7B5ltubDsCAJtvubXtCABs
vnF92xEA2HDdxrYjNAY0SkKf7tg0GjlGxsRo3HGvJgbfpWj+DKPxXjAxGjd8nByR92NU/l0yqJFe+jUiMWpEbkxaG+9oOwIAtWlz2xEAyAicSwG2XzYaf7eabkROINuYzYN7Ww8DjgIuS3Jxs+yNwEqAqjpuYEeexXgXzZIkSZKkoaoaXPfsqjob6PmbsKp60UCCdLFoliRJkiT1ZVQ6Ig2DRbMkSZIkqWdFbc0tp8aORbMkSZIkqS+DHD171DhagSRJkiRJs7ClWZIkSZLUs6qBjp49ciyaJUmSJEl9WUrdsy2aJUmSJEk9K2DziNy3fhgsmiVJkiRJvStYQjWzRbMkSZIkqXdLraXZ0bMlSZIkSZqFLc2SJEmSpJ4VxWYHApMkSZIkaQYFk0uoe7ZFsyRJkiSpZ0vtmmaLZkmSJElSX5ZQzWzRLEmSJEnqXdXSaml29GxJkiRJkmZhS7MkSZIkqQ+Onj0QSbYHvtbMPhy4oHn81Kq6ZVg5JEmSJEkLVzh69kBU1UbgCIAka6rqiGEdW5IkSZK0SLymeTiSvCXJU5vHL0vyoubxG5OckeTMJA9tK58kSZIk6c4K2Fy1oGkcjdQ1zU2R/ICqOjzJvYEPAkdOW+cY4BiAlXvtNvyQkiRJkrTE2T17OLrf5TQ/HwT8VpLTm/nNd9qoajWwGmDVw+6zdP6lJEmSJElD12bRfD2wT/P44cDZwHeAM6rqJQBJlreUTZIkSZI0g6paUtc0t1k0fw74YnNd820AVXVpku8nOQOYBE4F3t5iRkmSJEnSNBbNA1ZVq5qHj5jhuWOBY4ebSJIkSZLUiwImx3RQr4UYqYHAJEmSJEkjbondcsqiWZIkSZLUs2JpjZ7d2n2aJUmSJEkadbY0S5IkSZL64OjZkiRJkiTNqLymWZIkSZKk2XlNsyRJkiRJMyhgc9WCpvkk2SfJaUnWJrkiyatmWOcPk1zaTOckOWAQr3OKLc2SJEmSpN5VMTk5Oai9bwJeW1UXJtkZuCDJqVV1Zdc6PwQOr6rrkzwZWA0cMqhAFs2SJEmSpJFQVdcA1zSPb06yFtgLuLJrnXO6NjkX2HuQmSyaJUmSJEk9K7ZqILDdk6zpml9dVatnWjHJfsBBwHlz7O+PgZMXGqYXFs2SJEmSpN7VVg0Etq6qVs23UpKdgBOAV1fVTbOs8xg6RfOjFhqmFxbNkiRJkqSebWVL87ySLKdTMH+qqk6cZZ2HAR8GnlxV/zOwMFg0S5IkSZL6NKhbTiUJ8BFgbVW9Z5Z1VgInAkdV1fcGEqSLRbMkSZIkqWdVNcj7NB8GHAVcluTiZtkbgZXNsY8D/hbYDfjnTo3Npl66fC+URbMkSZIkaSRU1dlA5lnnJcBLhpPIolmSJEmS1KfNg7tP88ixaJYkSZIk9ay2bvTssTPWRXNNTjK5/ua2Y1CbNrYdAYDJW9a3HQGATdePRo4rLri17QgA7PftL7cdAYAH7/3GtiMAcPWI/H5cf91o/H6s325Z2xFYcZcd2o4AwOZNm9uOAMD67Ufkv8btl7edAIBMzNlDbsnZbkT+XrSlu4zIv8uK5RNtRwBgu4nRKGYmGI2WyGxqv17YFlk0S5IkSZI0g6oa6C2nRo1FsyRJkiSpL0uppXk0+pBIkiRJkjSCbGmWJEmSJPWsgMkajWvWh8GiWZIkSZLUO0fPliRJkiRpZkVZNEuSJEmSNJvJSbtnS5IkSZJ0Z0use7ajZ0uSJEmSNAtbmiVJkiRJPSugllBLs0WzJEmSJKl3VV7TLEmSJEnSbJbSNc0WzZIkSZKknlVBbbZoliRJkiRpBkure7ajZ0uSJEmSNAtbmiVJkiRJfXH0bEmSJEmSZlIWzZIkSZIkzahzn2avaV40SfZLUkke08xvn+T6JK8Y9LElSZIkSYusippc2DSOhjUQ2Brgmc3jxwPfH9JxJUmSJEmLrCYnFzSNo2EVzT8GViYJcCTweYAkr0vyrSTnJHl4s+zCJB9Mcl6S
NwwpnyRJkiRJdzLMW059C3g0sAdwDbAr8HTgMOAFwLHNersC7wR+E/iD6TtJckySNUnWrLvulmHkliRJkiR1sXv2YJwAvBc4vWvZJVU1WVU/AO7WLLu+qn5cVZPAbdN3UlWrq2pVVa3a/R47DTy0JEmSJKlLWTQPRFV9Hzgb+FzX4gOTTCS5L3DD1KrDyiRJkiRJ6lfB5OTCpjE01FtOVdUrATqXNnMD8EXgm3QK5T8bZhZJkiRJUv/K+zQvrqr6EfDsacuO75r9h2nPrep6fOggs0mSJEmSFmBMW40XYpjXNEuSJEmSNFaG2j1bkiRJkjTuxndQr4WwpVmSJEmS1LtiYAOBJdknyWlJ1ia5IsmrZlgnSd6X5KoklyY5eBAvc4otzZIkSZKk/gyupXkT8NqqujDJzsAFSU6tqiu71nkycP9mOgT4YPNzICyaJUmSJEm9qyIDGgisqq4Brmke35xkLbAX0F00PwP4eFUVcG6SXZPs2Wy76CyaJUmSJEn9WXhL8+5J1nTNr66q1TOtmGQ/4CDgvGlP7QX8tGv+Z80yi2ZJkiRJ0lhb132b4dkk2Qk4AXh1Vd00/ekZNhlYf3GLZkmSJElSzwID654NkGQ5nYL5U1V14gyr/AzYp2t+b+DqQeVx9GxJkiRJUu+KTvfshUzzSBLgI8DaqnrPLKt9CXhhM4r2ocCNg7qeGWxpliRJkiT1ZXADgQGHAUcBlyW5uFn2RmAlQFUdB/w78BTgKmA9cPSgwoBFsyRJkiSpHzW47tlVdTYzX7PcvU4BLx9IgBlYNEuSJEmS+jIxuPs0jxyvaZYkSZIkaRa2NEuSJEmSepbBXtM8ciyaJUmSJEm9q6XVPduiWZIkSZLUF1uaJUmSJEmaQaqYWEJFczqjdY+nJNcCP97K3ewOrFuEOFvLHFsyx5bMsSVzbMkcWzLHlsyxpVHIMQoZwBzTmWNL5tjSYuTYt6r2WIwwbUtyCp33ZCHWVdWTFjPPoI110bwYkqypqlXmMIc5zGEOc5jDHEslgznMYY7xzKF2eMspSZIkSZJmYdEsSZIkSdIsLJphddsBGubYkjm2ZI4tmWNL5tiSObZkji2NQo5RyADmmM4cWzLHlkYlh1qw5K9pliRJkiRpNrY0S5IkSZI0C4tmSZIkSZJmsaSL5iTvSnJWkk8l2b7FHDsnOS/JLUke0lKGhzfvxRlJPpNkeUs5HpLkm02OryTZqY0cXXme19wPvK3j75fk2iSnN1Nr9/ZLckSSrzf/Ns9oKcMju96L7yZ5b0s5JpJ8rPmbOSvJ/VrKsaw5f52e5Phh/t3OdN5K8twk5yT5RpJ9Wszx0ebv5hXDyDBTjiR3TfK1JGcmOS3Jfm3kaJad2fyOnDOs/2Nm+38tycokG9rMkeT7XeeRJ7SYY+8kX2pyvLmNHEm273ovzktyURs5mmWvSHJ+Mz2txRyvbf5Wvpbk14aU406fwVo6n86UY6jn0xky3K2lc+lM78XQz6UaIVW1JCfgIOCTzeM3Ac9vMct2wB7A8cBDWspwb+AuzeO3A89pKcfyrsdvBo5q8d9lAjgBuLDFDPsBn2vr+F05VgBfBrZvO0tXpg8Dh7d07IOBTzePnwC8t6UczwHe1jz+S+C5Qzz2FuctYDlwHrA9cBiwuo0czbI9gRcBr2jx/dgB2Kt57neAD7T4fixvfh4OfKStHM3yDwDfGNb/dbO8H2uG9XsxT45/nfodaTNH13MvAN7c4vtxZbN8F+BbbeSg81noG0CARwIfHFKOO30Ga+l8OlOOoZ5PZ8jw3JbOpTO9F0M/lzqNzrSUW5p/E/ha8/gU4LfaClJVm6qqtdbMJsN/V9X6ZvYOYFNLOe7omr0L8J02cjSeD3wOmGwxA8Bhzbedb0+SljL8FnAb8OUkn09y75ZyAJBkO+BQ4KyWIvysyRFgV6Ctv9/7Ahc3jy8EfntYB57hvHV/4Iqq2lhV3wQe2lIOquqaYRx7rhxVtaGq/quZHdo5
dZb3Y+q8ugtwWVs5ktwHKOAnw8gwWw5gp6b16F+S3KONHOn0CtkPeHfTkjiUzyDzfN54DvDZFnNcBewI7Az8T0s59qVzHis659RHDSnH9M9gv0E759M7fRYc9vl0hgwbWzqXzvReDP1cqtGxlIvmXYGbmsc3AkP5j3PUJVkJPB44qcUMT2i6iD0G+M+WMiwDfh/4tzaO3+Ua4NeBRwP3BI5sKce9gPsAT6Nzy4W3tJRjymOBM6qqrS801tH5MmUt8Pd0WinasJbOewGdv9tdW8oBW55TAZa1FWSUNMXR3wLvazHDHkm+CfwzcGZbOYDXA//Q4vGnHFZVh9P5wvwtLWXYHXgY8Do6X9D+Y0s5gE43ZWCfqrqyxRin0Glt/jbt/b38J/CIJDvQOafefZgH7/oMdjYtnk9H5LPgFhnaOpd25xihc6lasJSL5uvpfFMEnQ9717WYZSQk2QX4BHD0tBbfoaqqU6vqIDqtvMe0FOMFwGdaLMqAX7ZW3dp8630CcGBLUW4Azq6qjXS6ru3fUo4pQ2sRmcUTgduq6oHAs4D3tJTjJGBjkm/Q6Znx3y3lgC3PqQCb2woyYlYDx1VVK18AAlTVtVV1GJ3f1be3kSHNdf9V9aM2jt+tqqZaMT9Lu+fU71XVz6rqv4FNTQ+atjwd+FJbB28+fxxDp8fKA4FWelZV1TrgODo9EZ8IfHdYx+7+DAb8gpbOp6PwWXCWDEM/l07PMQrnUrVnKRfN59K5NgI6J8ZvtpildU3L6qeAv6uq77WYY4eu2RuBW1uKsj/wwiSnAPdPewNO7dw1+2g63dfacD6/KpQPAn7QUo6prtm/Sfvf8l7f/LyBllp4q2qyql5TVY+l88XfF9rI0bgK2L8ZWOgw4NIWs4yEJH8N/LCqWuuxkmS7JFP/17d5Tj0AeHBzTn0CcFxaGHCy+f2c+n+mtXNqVd0G3NAMcnRXOuNFtHJZVKPtLyIngduBDcB6OmMCtHI5UlUd3/RE+CKdL4kHbobPYK2cT0fhs+BMGdo4l07PMULnUrUknQaspSnJu+hcF/kTOt8ibWwxy7/T+cb7x3QGnvj4kI//POD9/OoajQ+28UEvyVOBv6DzH+i1wIu6rilpRZI1VbWqpWM/GfhfdD5E/BB4cVsfrJK8nM6AHJNNjlYK53RGuz2yqv60jeM3GZbR+fZ5Lzof7v68qs5pIce9gU/TucbrP6rqnUM+/hbnLTofeF9N58PvC6vqpy3leBCdlrNlwFeq6rUt5fgIv/pC9ltV9YYWchwHvJjO3+0k8PKqGspYEbP9v5bkeOAfquryFnIcB7ySzgfeDXTOZW39nl4FHEtnEL23VtXJLeX4PJ3LXQ4exvHnyHFP4Nl0/m4/UlXHtZTjKXQGB/sxnb+X24aQ4U6fwZqfQz2fzpLjQIZ4Pp0hw0fpDPw51HPpLDlaOZdqNCzpolmSJEmSpLks5e7ZkiRJkiTNyaJZkiRJkqRZWDRLkiRJkjQLi2ZJkiRJkmZh0SxJkiRJ0iwsmiVJkiRJmoVFsyRp7CT5cJL953j+LUleN6BjvyjJ+wexb0mSNHq2azuAJEn9qqqXtJ1hsSXZrqo2tZ1DkiRtyZZmSdLISrJfku8k+ViSS5N8LsldkpyeZFWzzpOSXJjkkiRfn2Eff5Lk5CQ7Jrmla/mzkxzfPD4+yXFJzkryvSRPnSfaryU5Jcn3k/x91z6fl+SyJJcnObZr+VzHfU+S04BjkSRJI8eWZknSqHsA8MdV9c0k/xf406knkuwBfAh4dFX9MMk9ujdM8grgd4Dfq6oNSeY6zn7A4cD9gNOS/HpV3T7LugcCBwEbgO8m+SdgM53C9+HA9cDXkvxeVX1hntf3G8Djq2rzPOtJkqQW2NIsSRp1P62qbzaPPwk8quu5Q4Ezq+qHAFV1XddzRwFPBp5VVRt6OM5nqmqyqr4P/AB44Bzrfr2qbmyK6iuBfYFHAKdX1bVNN+tPAY/u4biftWCWJGl0WTRL
kkZdzTGfGZ6fcjmd1uO9Z9l2RR/Hma67CN9Mp+fWXM3Ycx331jm2kyRJLbNoliSNupVJfrN5/Dzg7K7nvgUcnuQ+ANO6Z18EvBT4UpJfa5b9PMmDkkwAR047znOSTCS5H3Bf4Lt95jyvybJ7kmVN1jN6OK4kSRphFs2SpFG3FvijJJcC9wA+OPVEVV0LHAOcmOQS4N+6N6yqs4HXAV9JsjvwV8BJwDeAa6Yd57t0ityTgZfNcT3zjKrqGuANwGnAJcCFVfXF5um5jitJkkZYqubqfSZJUnuS7AecVFUPGfBxjm+O87lBHkeSJI0fW5olSZIkSZqFLc2SJM0gyRO5872Tf1hVXpMsSdISYtEsSZIkSdIs7J4tSZIkSdIsLJolSZIkSZqFRbMkSZIkSbOwaJYkSZIkaRb/D14Tz0Maw0FxAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 1080x360 with 2 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "# Mean trip distance for each pick-up hour and day of the week\n",
    "df.plot('pickup_hour', 'pickup_day_of_week', what='mean(trip_distance)',\n",
    "        colorbar=True, colormap=cm_plusmin, figsize=(15, 5))\n",
    "\n",
    "plt.xticks(np.arange(24), np.arange(24))\n",
    "plt.yticks(np.arange(7), weekday_names_list)\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Groupby examples"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:13:50.662804Z",
     "start_time": "2020-06-10T17:13:41.910801Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead>\n",
       "<tr><th>#                             </th><th>pickup_hour  </th><th>tip_amount        </th><th>tip_amount_weekend  </th></tr>\n",
       "</thead>\n",
       "<tbody>\n",
       "<tr><td><i style='opacity: 0.6'>0</i> </td><td>0            </td><td>1.0583148055719318</td><td>1.028166641654939   </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1</i> </td><td>1            </td><td>1.0395988285791358</td><td>1.0505457487273462  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>2</i> </td><td>2            </td><td>1.0271794254275997</td><td>1.0636084270722341  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>3</i> </td><td>3            </td><td>1.0004258190973754</td><td>1.050149848503694   </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>4</i> </td><td>4            </td><td>0.9259400499895432</td><td>0.9313596322247085  </td></tr>\n",
       "<tr><td>...                           </td><td>...          </td><td>...               </td><td>...                 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>19</i></td><td>19           </td><td>1.0345855021860877</td><td>1.0917831661780708  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>20</i></td><td>20           </td><td>1.0202804182400866</td><td>0.911085101443365   </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>21</i></td><td>21           </td><td>1.0258125215768232</td><td>0.8084636210998097  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>22</i></td><td>22           </td><td>1.0711251054555473</td><td>0.9458610337789833  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>23</i></td><td>23           </td><td>1.077161833123374 </td><td>0.9798870401138848  </td></tr>\n",
       "</tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "#    pickup_hour    tip_amount          tip_amount_weekend\n",
       "0    0              1.0583148055719318  1.028166641654939\n",
       "1    1              1.0395988285791358  1.0505457487273462\n",
       "2    2              1.0271794254275997  1.0636084270722341\n",
       "3    3              1.0004258190973754  1.050149848503694\n",
       "4    4              0.9259400499895432  0.9313596322247085\n",
       "...  ...            ...                 ...\n",
       "19   19             1.0345855021860877  1.0917831661780708\n",
       "20   20             1.0202804182400866  0.911085101443365\n",
       "21   21             1.0258125215768232  0.8084636210998097\n",
       "22   22             1.0711251054555473  0.9458610337789833\n",
       "23   23             1.077161833123374   0.9798870401138848"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Mean tip per pick-up hour: overall, and restricted to weekend pick-ups\n",
    "df_per_hour = df.groupby(by=df.pickup_hour).agg({\n",
    "    'tip_amount': 'mean',\n",
    "    'tip_amount_weekend': vaex.agg.mean('tip_amount', selection='pickup_is_weekend==1'),\n",
    "})\n",
    "\n",
    "# Display the grouped DataFrame\n",
    "df_per_hour"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:14:45.087860Z",
     "start_time": "2020-06-10T17:14:44.760150Z"
    }
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAA+gAAAFgCAYAAAAo31N4AAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4yLjEsIGh0dHA6Ly9tYXRwbG90bGliLm9yZy+j8jraAAAgAElEQVR4nO3debhkZXX3/e+PBiIogwo4MNjog0Q0jg3oE0eIMqlAQAUVFSWtJqgYJxJ9EnmNiUaJQzQgQVAQBYNgQEEwKmiiKIPMoCIgNCANDqAYQWC9f+zdUBzOOV2nhj67u76f66rr1J5Wrarap+5ae9/7rlQVkiRJkiRpfq023wlIkiRJkiQLdEmSJEmSOsECXZIkSZKkDrBAlyRJkiSpAyzQJUmSJEnqAAt0SZIkSZI6wAJd0oyS/G2Sw+c7D0mSxsW2rpHkj5JcmuThK/AxK8n/GfNjLGwfZ/UxxO4r/yRvTvKBUT++Vk0W6NI8SXJ1kjuSbDBl/vntB/7CFZzPc5Ms6Z1XVf9YVfutyDxWtPZ9+LP5zkOSVkW2dd3QZ1u3GPh2Vf18ReQ0YQ4DXplko/lORN1ngS7Nr6uAvZdNJPkTYK35S0eSpJGzrVs5vB44er6TWBVV1e+BU4FXzXcu6j4LdGl+Hc19P6xfDRzVu0Lb5ezDSa5JcmOSQ5Os1S57cJKvJLkpya/a+5v0bHtGkvcl+Z8kv0ly+tSzGO16D6RpOB6Z5Lft7ZFJ3pvkc+06y7qILU5yfZIbkrxtpieWZJckP0xya5Jrk7y3Z9myWPu2y36V5A1Jtk5yYZJfJ/lEz/qrJXlPkp8lWZrkqCTrtcvudzak90xB+xy+2G7zmySXJFnULjsa2Aw4uX3O75z97ZIkDcC2ruNtXZLNgMcA32+nN2/zW62dPjzJ0p71P5fkgPb+ekk+3b5W1yX5hyQLetZ9bZLL2ud/WpJHzfBaPrN9nZ63vO3a1/UNSX7SLv9kkrTLFrT70s1JrgR2men9a9d/XLsP/bp93V7cs+wzbeyvtq/r95M8ZpoYW7f77eo98/ZIcn7PamcsLxcJLNCl+XYWsG7bOCwAXgZ8bso6HwQeCzwZ+D/AxsDftctWA44EHkXT+P4v8Ikp278c2BfYCFgTePvUJKrqNmAn4PqqelB7u36GnJ8HbAG8ADgwM3eZu43mC9n6NA3SG5PsNmWdbdtYLwM+Crwb+DPg8cBLkzynXe817e15wKOBB03zPGfzYuDYNpeTlm1bVfsA1wAvap/zP88hpiSpP7Z13W/r/gS4sqrubLe5CrgVeEq7/FnAb5M8rp1+NnBme/+zwJ0079tTaF6z/QDa1+JvgT8HNgS+A3xh6oMn2aGdv0dVfavP7V4IbA08CXgpsEM7/y/aZU8BFgF7zvSCJVkDOBk4nWbfeRNwTJIte1bbGzgIeDBwBfD+qXGq6mzgF8Dze2a/kvv2SLiszVWalQW6NP+WnVl4PnA5cN2yBe3R4L8A3lpVv6yq3wD/COwFUFW/qKovVdXv2mXvB54zJf6RVfXjqvpf4Is0X36GcVBV3VZVF9F8Ydp7upWq6oyquqiq7q6qC2ka1qm5va+qfl9Vp9N8yflCVS2tqutoGuNlXwxeAfxLVV1ZVb8F/gbYK/0P+PLfVXVKVd1F83rbQErSimVb1+22bn3gN1PmnQk8J/cOGnd8O705sC5wQZKH0Rz0OKB9vZYCH6F972i6zf9TVV3WFv//CDx5yln0l9Bco71zVf1gDtt9oKp+XVXXAN/i3vf8pcBHq+raqvol8E+zPO+n0xwI+UBV3VFV3wS+wn3f7xOq6gdtHscw8771WZqinCQPoTlg8Pme5b8B1pslFwmAkY9mKGnOjga+DWzOlC5/NEeN1wbObXtuAQRYAJBkbZqGcEeaI7sA6yRZ0DbQAL2DvfyOpiEaxrU9939Gc9T9fpJsC3wAeALN2Yw/Av5j
ymo39tz/32mml+X6yPaxeh93deBhfeY89TV4QJLVl50pkCSNnW1do6tt3a+AdabMO5PmrPwSmvfuDGAf4PfAd6rq7rZgXgO4oee9W417X79HAR9LcnBP3ND0kFj2XA8AjmoPhjCH7WZ6zx/J/d+/mTwSuLaq7p6y/sY90/3uW58DLkvyIJqDBN+pqht6lq8D3DJLLhLgGXRp3lXVz2gG0NkZOGHK4ptpGu/HV9X67W29qlrWOLwN2BLYtqrWpelyBk0jNudU+lxv0577mwEzdQ/8PE0Xu02raj3g0AHzon2M3qPmm9F0p7uR5mzE2ssWtN0nN5xD7H6ftyRpQLZ1fZnPtu5C4NFTztafSdO1/bnt/f8G/pSmh8Cy7u3XArcDG/S8d+tW1eN7lr++Z9n6VbVWVX2353FeAuy27Jr2OWw3kxu4//s3k+uBTZdda9+z/nUzrD+jtkfE94DdaQ5kTB1w73HABXONq8ljgS51w+uA7drr4+7RHtH9d+AjaX+aI8nG7bVa0ByN/V/g1213qr8fIocbgYcuG5BmFv8vydpJHk9zvd9xM6y3DvDLqvp9km1org8c1BeAt7aD1jyIpqvbce1ZgR/TnCXYpb2W7D00ZzD6dSPNtX6SpPGyrZvdvLV1VbUE+AmwTc+8n9C87q+k+fm1W9s4e9AW6O0Z4tOBg5Osm2agu8f0XFd/KPA37eu4bEC5l0x5+OuB7YE3J/nLOWw3ky+2sTZJ8mDgwFnW/T7NwY93JlkjyXOBF9Fcyz+Io4B30vS4OHHKsufQDFIozcoCXeqAqvppVZ0zw+J30QxKclaSW4H/ojmTAM1gM2vRnH04C/jaEDlcTvPl4Mp2JNNHzrDqmW0+3wA+3F5TN52/BP6/JL+hGejni4PmBhzBvd0jr6LpXvemNu9b2sc6nOaI92003fH69U/Ae9rnfL9BhSRJo2Fbt1zz3dZ9iubMb68zgV+013kvmw7ww551XkXTvf9Smq7yxwOPaPM+kWYAwGPb9/VimmvW76ONvz3wriT79bvdDP4dOI3mbPV53L/HRu/j3kHTjX8nmv3r34BXtfvJIE6k6QVxYu+BqCQPoOk98tkB42qCpMrenZKWL8lCmi8Ma3jttiRpVTTJbV2SP6IpvLefcu205iDJT2m65/9Xz7w30VwG4c+5arkcJE6SJEmacFV1O7DVfOexMkuyB831/t/snV9V/zo/GWllZIEuSZIkSUNIcgbNAY59powKL82JXdwlSZIkSeoAB4mTJEmSJKkDVrou7htssEEtXLhwvtOQJKmTzj333Jurai6/jzwWtteSJM1spvZ6pSvQFy5cyDnnzPQLHZIkTbYkP5vvHMD2WpKk2czUXtvFXZIkSZKkDrBAlyRJkiSpAyzQJUmSJEnqAAt0SZIkSZI6wAJdkiRJkqQOsECXJEmSJKkDLNAlSZIkSeoAC3RJkiRJkjrAAl2SJEmSpA6wQJckSZIkqQMs0CVJkiRJ6gALdEmSJEmSOmD1+U5AkiQ1ln7i1IG33Wj/nUaYiSRpUpx95NKBt916341GmInAM+iSJEmSJHWCZ9AlSRrCjR/73sDbPuwtzxhhJpIkaWXnGXRJkiRJkjrAAl2SJEmSpA5Yabu433TI5wbedsM3vnKEmUiSJEmSNDzPoEuSJEmS1AEr7Rl0SZKkQV3z8T0H3nazNx8/wkwkSbqXZ9AlSZIkSeoAC3RJkiRJkjrALu7AjYd8aOBtH/bGd4wwE0mSJEnSpLJAlySNzdlHLh1426333WiEmWhV0NUD6md/6kUDb7v1608eYSaSpJWdXdwlSZIkSeoAz6CP2DCjwoIjw0qSJEnSpLJAlyRNnJ8ffPnA2z78bX88wkwkSZLuZYEuSVopXP3Rnw+87cIDHj7CTCRJksbDAl2SJI3NTYd8buBtN3zjK0eYiSRJ3Te2QeKSHJFkaZKLZ1ieJB9PckWSC5M8dVy5SJIkSZLUdeMcxf0zwI6zLN8J2KK9LQYO
GWMukiRJkiR12tgK9Kr6NvDLWVbZFTiqGmcB6yd5xLjykSRJkiSpy+bzGvSNgWt7ppe0826YumKSxTRn2dlss81WSHJdcPanXjTU9lu//uQRZSJJWlUlOQJ4IbC0qp4wzfIAHwN2Bn4HvKaqzluxWUqSNBnms0DPNPNquhWr6jDgMIBFixZNu440avueONsVGrM7cvevjTATSRqrzwCfAI6aYXnvJWnb0lyStu0KyUySpAkzzmvQl2cJsGnP9CbA9fOUiyRJE8lL0iRJ6o75LNBPAl7Vjub+dOCWqrpf93ZJkjSvZrokTZIkjdjYurgn+QLwXGCDJEuAvwfWAKiqQ4FTaK5nu4LmmrZ9x5WLJEkaWN+XpE3qmDGSJI3K2Ar0qtp7OcsL+KtxPb4kSRqJvi9Jc8wYSZKGM5+DxEkTY+cvv23gbU/Z7eARZiJJc3YSsH+SY2kGh/OSNEmSxsQCXZKkCeYlaZIkdYcFuiRJE8xL0iRJ6g4LdEnSPU497uahtt/pZRuMKBNJkqTuWfqJU4fafqP9d5p1+Xz+zJokSZIkSWpZoEuSJEmS1AEW6JIkSZIkdYAFuiRJkiRJHWCBLkmSJElSB1igS5IkSZLUARbokiRJkiR1gL+DLkmSJElaZd34se8Ntf3D3vKMEWWyfBbokrSSO/yEpUNtv9+fbzSiTCRJkjQMC/QJcdqndx5q+x1ed8o99487csehYr1s368Ntb0kSZIkrYq8Bl2SJEmSpA7wDLokSVJHDNPjrbe3myRp5WSBrlXK+4/bYeBt3/2y00aYiSRJkiTNjV3cJUmSJEnqAAt0SZIkSZI6wC7u0kpmlxM/NPC2X939HSPMRJIkSdIoWaBLkiRJkoZ29Ud/PvC2Cw94+AgzWXnZxV2SJEmSpA6wQJckSZIkqQPs4i5NsF2+dNjA2351j8UjzESSJEmSBbokzYM3n3jtUNt/fPdNR5SJJEmSusIu7pIkSZIkdYAFuiRJkiRJHWAXd0kj8cLjjxl426/s+YoRZiJJkiStnDyDLkmSJElSB3gGXVLnvOj4Lw+87cl77naf6d2O/8bAsb685/YDbytJkiTNlWfQJUmSJEnqAM+ga9596ugdBt729fucNsJMJEmSJGn+WKBLUp/2/NJ5A297/B5PHWEmkiRJWhXZxV2SJEmSpA6wQJckSZIkqQPs4i5JkiRJK5FTj7t54G13etkGI8xEo+YZdEmSJEmSOsACXZIkSZKkDrBAlyRJkiSpAyzQJUmSJEnqAAeJkyRJkqQxO/yEpQNvu9+fbzTCTNRlFuiSJEmroOOO3HHgbV+279dGmIkkqV9j7eKeZMckP0pyRZIDp1m+XpKTk1yQ5JIk+44zH0mSJEmSumpsBXqSBcAngZ2ArYC9k2w1ZbW/Ai6tqicBzwUOTrLmuHKSJEmSJKmrxtnFfRvgiqq6EiDJscCuwKU96xSwTpIADwJ+Cdw5xpwkSZI0R586eoeBt339PqeNMBNJWrWNs4v7xsC1PdNL2nm9PgE8DrgeuAh4S1XdPcacJElSDy9HkySpO8Z5Bj3TzKsp0zsA5wPbAY8Bvp7kO1V1630CJYuBxQCbbbbZGFKVJGny9FyO9nyaA+lnJzmpqnp7uy27HO1FSTYEfpTkmKq6Yx5S1irg/ccNfjb+3S+779n4fU8cfCA8gCN3n7zB8F54/DEDb/uVPV8xwkxWDm8+8drlrzSDj+++6QgzmTw/P/jyobZ/+Nv+eESZrFjjLNCXAL175SY0Z8p77Qt8oKoKuCLJVcAfAz/oXamqDgMOA1i0aNHUIl+SJA3Gy9EkDexFx3954G1P3nO3+0zvdvw3Bo715T23v8/0nl86b+BYx+/x1IG3lUZhnF3czwa2SLJ5O/DbXsBJU9a5BtgeIMnDgC2BK8eYkyRJutdIL0dLsjjJOUnOuemmm8aRryRJq7SxnUGvqjuT7A+cBiwAjqiqS5K8oV1+KPA+4DNJLqLpEv+uqrp5XDlJkqT7
GNnlaGCPN2lcdvnSYQNv+9U9Fo8wE0njNs4u7lTVKcApU+Yd2nP/euAF48xBkiTNaGSXo0mSpOGNs4u7JEnqNi9HkySpQ8Z6Bl2SJHWXl6NJktQtFuiSJE0wL0eTJKk7LNAlSZI08Xb+8tuG2v6U3Q4eUSaSJpkFuiRJkjRCu5z4oaG2/+ru7xhRJpJWNg4SJ0mSJElSB1igS5IkSZLUARbokiRJkiR1gAW6JEmSJEkdYIEuSZIkSVIHWKBLkiRJktQBFuiSJEmSJHWABbokSZIkSR1ggS5JkiRJUgcst0BP8sF+5kmSJEmSpMH1cwb9+dPM22nUiUiSJEmSNMlWn2lBkjcCfwk8OsmFPYvWAf5n3IlJkiRJkjRJZizQgc8DpwL/BBzYM/83VfXLsWYlSZIkSdKEmbFAr6pbgFuAvZMsAB7Wrv+gJA+qqmtWUI6SJEmSJK3yZjuDDkCS/YH3AjcCd7ezC3ji+NKSJEmSJGmyLLdABw4AtqyqX4w7GUmSJEmSJlU/o7hfS9PVXZIkSZIkjUk/Z9CvBM5I8lXg9mUzq+pfxpaVJEmSJEkTpp8C/Zr2tmZ7kyRJkiRJI7bcAr2qDloRiUiSJEmSNMn6GcX9WzSjtt9HVW03lowkSZIkSZpA/XRxf3vP/QcAewB3jicdSZIkSZImUz9d3M+dMut/kpw5pnwkSZIkSZpI/XRxf0jP5GrA04CHjy0jSZIkSZImUD9d3M+luQY9NF3brwJeN86kJEmSJEmaNP10cd98RSQiSZIkSdIk66eL+xrAG4Fnt7POAD5VVX8YY16SJEmSJE2Ufrq4HwKsAfxbO71PO2+/cSUlSZIkSdKk6adA37qqntQz/c0kF4wrIUmSJEmSJtFqfaxzV5LHLJtI8mjgrvGlJEmSJEnS5OnnDPo7gG8luZJmJPdHAfuONStJkiRJkiZMP6O4fyPJFsCWNAX65VV1+9gzkyRJkiRpgvQzivsCYAdgYbv+9kmoqn8Zc26SJEmSJE2Mfrq4nwz8HrgIuHu86UiSJEmSNJn6KdA3qaonjj0TSZIkSZImWD+juJ+a5AVjz0SSJEmSpAnWzxn0s4ATk6wG/IFmoLiqqnXHmpkkSZIkSROknwL9YOAZwEVVVWPOR5IkSZKkidRPF/efABcPUpwn2THJj5JckeTAGdZ5bpLzk1yS5My5PoYkSZIkSauCfs6g3wCckeRU4J7fP1/ez6y1P8/2SeD5wBLg7CQnVdWlPeusD/wbsGNVXZNkowGegyRJkiRJK71+zqBfBXwDWBNYp+e2PNsAV1TVlVV1B3AssOuUdV4OnFBV1wBU1dJ+E5ckScOzt5skSd2x3DPoVXXQgLE3Bq7tmV4CbDtlnccCayQ5g6bo/1hVHTU1UJLFwGKAzTbbbMB0JElSL3u7SZLULcst0JNsCLwTeDzwgGXzq2q75W06zbyp17GvDjwN2B5YC/hekrOq6sf32ajqMOAwgEWLFjlQnSRJo3FPbzeAJMt6u13as4693SRJWkH66eJ+DHA5sDlwEHA1cHYf2y0BNu2Z3gS4fpp1vlZVt1XVzcC3gSf1EVuSJA1vut5uG09Z57HAg5OckeTcJK+aKViSxUnOSXLOTTfdNIZ0JUlatfVToD+0qj4N/KGqzqyq1wJP72O7s4EtkmyeZE1gL+CkKev8J/CsJKsnWZumC/xlc8hfkiQNbi693XYBdgD+X5LHThesqg6rqkVVtWjDDTccbaaSJE2AfkZx/0P794Yku9CcBd9keRtV1Z1J9gdOAxYAR1TVJUne0C4/tKouS/I14ELgbuDwqrp4kCciSZLmrN/ebjdX1W3AbUmW9Xb7MZIkaaT6KdD/Icl6wNuAfwXWBd7aT/CqOgU4Zcq8Q6dMfwj4UF/ZSpKkUbqntxtwHU1vt5dPWec/gU8kWZ3mF122BT6yQrOUJGlC9DOK+1fau7cAzxtvOpIkaUWxt5skSd3Szxl0SZK0irK3myRJ3dHPIHGSJEmS
JGnMLNAlSZIkSeqA5RboSR6a5F+TnNf+/unHkjx0RSQnSZIkSdKk6OcM+rHAUmAPYE/gJuC4cSYlSZIkSdKk6WeQuIdU1ft6pv8hyW7jSkiSJEmSpEnUzxn0byXZK8lq7e2lwFfHnZgkSZIkSZOknwL99cDngduBO2i6vP91kt8kuXWcyUmSJEmSNCmW28W9qtZZEYlIkiRJkjTJZizQk/xxVV2e5KnTLa+q88aXliRJkiRJk2W2M+h/DSwGDp5mWQHbjSUjSZIkSZIm0IwFelUtbu/uVFW/712W5AFjzUqSJEmSpAnTzyBx3+1zniRJkiRJGtBs16A/HNgYWCvJU4C0i9YF1l4BuUmSJEmSNDFmuwZ9B+A1wCY016EvK9BvBf52vGlJkiRJkjRZZrsG/bPAZ5PsUVVfWoE5SZIkSZI0cZZ7DbrFuSRJkiRJ49fPIHGSJEmSJGnMLNAlSZIkSeqA2QaJu0eS/wss7F2/qo4aU06SJEmSJE2c5RboSY4GHgOcD9zVzi7AAl2SJEmSpBHp5wz6ImCrqqpxJyNJkiRJ0qTq5xr0i4GHjzsRSZIkSZImWT9n0DcALk3yA+D2ZTOr6sVjy0qSJEmSpAnTT4H+3nEnIUmSJEnSpFtugV5VZ66IRCRJkiRJmmTLvQY9ydOTnJ3kt0nuSHJXkltXRHKSJEmSJE2KfgaJ+wSwN/ATYC1gv3aeJEmSJEkakX6uQaeqrkiyoKruAo5M8t0x5yVJkiRJ0kTpp0D/XZI1gfOT/DNwA/DA8aYlSZIkSdJk6aeL+z7tevsDtwGbAnuMMylJkiRJkiZNP6O4/yzJWsAjquqgFZCTJEmSJEkTp59R3F8EnA98rZ1+cpKTxp2YJEmSJEmTpJ8u7u8FtgF+DVBV5wMLx5eSJEmSJEmTp58C/c6qumXsmUiSJEmSNMH6GcX94iQvBxYk2QJ4M+DPrEmSJEmSNEL9nEF/E/B44HbgC8CtwAHjTEqSJEmSpEnTzyjuvwPe3d4kSZIkSdIYLLdAT7II+FuageHuWb+qnji+tCRJkiRJmiz9XIN+DPAO4CLg7vGmI0mSJEnSZOqnQL+pqvzdc0mSJEmSxqifAv3vkxwOfINmoDgAquqEsWUlSZIkSdKE6WcU932BJwM7Ai9qby/sJ3iSHZP8KMkVSQ6cZb2tk9yVZM9+4kqSJEmStKrp5wz6k6rqT+YaOMkC4JPA84ElwNlJTqqqS6dZ74PAaXN9DEmSNJwkOwIfAxYAh1fVB2ZYb2vgLOBlVXX8CkxRkqSJ0c8Z9LOSbDVA7G2AK6rqyqq6AzgW2HWa9d4EfAlYOsBjSJKkAfUcTN8J2ArYe7o234PpkiStGP0U6M8Ezm+7ql+Y5KIkF/ax3cbAtT3TS9p590iyMbA7cOhsgZIsTnJOknNuuummPh5akiT1wYPpkiR1SD9d3HccMHammVdTpj8KvKuq7kqmW73dqOow4DCARYsWTY0hSZIGM93B9G17V+g5mL4dsPVswZIsBhYDbLbZZiNNVJKkSbDcAr2qfjZg7CXApj3TmwDXT1lnEXBsW5xvAOyc5M6q+vKAjylJkvo3soPp4AF1SZKG1c8Z9EGdDWyRZHPgOmAv4OW9K1TV5svuJ/kM8BWLc0mSVhgPpkuS1CFjK9Cr6s4k+9MMKLMAOKKqLknyhnb5rNedS5KksfNguiRJHTLOM+hU1SnAKVPmTVuYV9VrxpmLJEm6Lw+mS5LULWMt0CVJUrd5MF2SpO7o52fWJEmSJEnSmFmgS5IkSZLUARbokiRJkiR1gAW6JEmSJEkdYIEuSZIkSVIHWKBLkiRJktQBFuiSJEmSJHWABbokSZIkSR1ggS5JkiRJUgdYoEuSJEmS1AEW6JIkSZIkdYAFuiRJkiRJHWCBLkmSJElSB1igS5IkSZLUARbokiRJkiR1gAW6JEmSJEkdYIEuSZIkSVIHWKBLkiRJktQBFuiSJEmSJHWABbokSZIkSR1ggS5JkiRJ
UgdYoEuSJEmS1AEW6JIkSZIkdYAFuiRJkiRJHWCBLkmSJElSB1igS5IkSZLUARbokiRJkiR1gAW6JEmSJEkdYIEuSZIkSVIHWKBLkiRJktQBFuiSJEmSJHWABbokSZIkSR1ggS5JkiRJUgdYoEuSJEmS1AEW6JIkSZIkdYAFuiRJkiRJHWCBLkmSJElSB1igS5IkSZLUARbokiRJkiR1gAW6JEmSJEkdYIEuSZIkSVIHjLVAT7Jjkh8luSLJgdMsf0WSC9vbd5M8aZz5SJIkSZLUVWMr0JMsAD4J7ARsBeydZKspq10FPKeqngi8DzhsXPlIkqT782C6JEndMc4z6NsAV1TVlVV1B3AssGvvClX13ar6VTt5FrDJGPORJEk9PJguSVK3jLNA3xi4tmd6STtvJq8DTp1uQZLFSc5Jcs5NN900whQlSZpoHkyXJKlDxlmgZ5p5Ne2KyfNoCvR3Tbe8qg6rqkVVtWjDDTccYYqSJE20kR1MBw+oS5I0rHEW6EuATXumNwGun7pSkicChwO7VtUvxpiPJEm6r5EdTAcPqEuSNKxxFuhnA1sk2TzJmsBewEm9KyTZDDgB2KeqfjzGXCRJ0v15MF2SpA5ZfVyBq+rOJPsDpwELgCOq6pIkb2iXHwr8HfBQ4N+SANxZVYvGlZMkSbqPew6mA9fRHEx/ee8KHkyXJGnFGVuBDlBVpwCnTJl3aM/9/YD9xpmDJEmangfTJUnqlrEW6JIkqds8mC5JUneM8xp0SZIkSZLUJwt0SZIkSZI6wAJdkiRJkqQOsECXJEmSJKkDLNAlSZIkSeoAC3RJkiRJkjrAAl2SJEmSpA6wQJckSZIkqQMs0CVJkiRJ6gALdEmSJEmSOsACXZIkSZKkDrBAlyRJkiSpAyzQJUmSJEnqAAt0SZIkSZI6wAJdkiRJkqQOsECXJEmSJKkDLNAlSZIkSeoAC3RJkiRJkjrAAl2SJEmSpA6wQJckSZIkqQMs0CVJkiRJ6gALdEmSJEmSOsACXZIkSZKkDrBAlyRJkiSpAyzQJUmSJEnqAAt0SZIkSZI6wAJdkiRJkqQOsECXJEmSJKkDLNAlSZIkSeoAC3RJkiRJkjrAAl2SJEmSpA6wQJckSZIkqQMs0CVJkmqxYZwAAA4rSURBVCRJ6gALdEmSJEmSOsACXZIkSZKkDrBAlyRJkiSpAyzQJUmSJEnqAAt0SZIkSZI6wAJdkiRJkqQOsECXJEmSJKkDLNAlSZIkSeoAC3RJkiRJkjpgrAV6kh2T/CjJFUkOnGZ5kny8XX5hkqeOMx9JknRfttWSJHXH2Ar0JAuATwI7AVsBeyfZaspqOwFbtLfFwCHjykeSJN2XbbUkSd0yzjPo2wBXVNWVVXUHcCyw65R1dgWOqsZZwPpJHjHGnCRJ0r1sqyVJ6pBU1XgCJ3sCO1bVfu30PsC2VbV/zzpfAT5QVf/dTn8DeFdVnTMl1mKao/YAWwI/6iOFDYCbh34ixjLW5MYadTxjGctYKybWo6pqw34CjrKtbpfNtb1e2V/r+YhnLGMZqzuxRh3PWJMVa9r2evURJTGdTDNv6tGAftahqg4DDpvTgyfnVNWiuWxjLGMZa3zxjGUsY3UnVm/YaeYN1FbD3Nvrrr4+fpYay1jGmo94xjIWjLeL+xJg057pTYDrB1hHkiSNh221JEkdMs4C/WxgiySbJ1kT2As4aco6JwGvakeIfTpwS1XdMMacJEnSvWyrJUnqkLF1ca+qO5PsD5wGLACOqKpLkryhXX4ocAqwM3AF8Dtg3xGmMKcu8cYylrHGHs9YxjJWd2IBttUrKNao4xnLWMbqTqxRxzOWscY3SJwkSZIkSerfOLu4S5IkSZKkPlmgS5IkSZLUAatcgZ5kxyQ/SnJFkgOHjHVEkqVJLh4yzqZJvpXksiSXJHnLkPEekOQHSS5o4x00ZLwFSX7Y/tbtUJJcneSiJOcnud9v5M4x1vpJjk9yefvaPWPAOFu2+Sy73Zrk
gCHyemv7ul+c5AtJHjBErLe0cS6Za07T7Z9JHpLk60l+0v598BCxXtLmdXeSvn8qYoZYH2rfxwuTnJhk/SFiva+Nc36S05M8cpjcepa9PUkl2WCI3N6b5LqefW3nYfJK8qb28+ySJP88RF7H9eR0dZLzh4j15CRnLfsfT7LNELGelOR77WfGyUnW7TPWtJ+pg+z/s8Sa8/4/S6w57/+zxBp4/++ajKi9nu3/eoBYI2uvM+K2uo05kvY6ttVzjTVwW91ub3s9h8+r2f6nY1vdb6yB2upZ4s25vZ7p83SQfX+WWKtmW11Vq8yNZoCbnwKPBtYELgC2GiLes4GnAhcPmdcjgKe299cBfjxkXgEe1N5fA/g+8PQh4v018HngKyN4D64GNhjR+/lZYL/2/prA+iPaR34OPGrA7TcGrgLWaqe/CLxmwFhPAC4G1qYZsPG/gC3msP399k/gn4ED2/sHAh8cItbjgC2BM4BFQ+b1AmD19v4Hh8xr3Z77bwYOHSa3dv6mNINk/azf/XeG3N4LvH2AfWG6WM9r94k/aqc3GuY59iw/GPi7IfI6Hdipvb8zcMYQsc4GntPefy3wvj5jTfuZOsj+P0usOe//s8Sa8/4/S6yB9/8u3Rhhe728fX6OsUbWXjPitrqNM5L2GtvqucQaqq1uY9hez+Hzaqb/aWyr55LXQG31LPHm3F5jWz3nfX/ZbVU7g74NcEVVXVlVdwDHArsOGqyqvg38ctikquqGqjqvvf8b4DKaxmPQeFVVv20n12hvA432l2QTYBfg8EHzGYf2yNyzgU8DVNUdVfXrEYTeHvhpVf1siBirA2slWZ2mwR7094AfB5xVVb+rqjuBM4Hd+914hv1zV5ovS7R/dxs0VlVdVlU/6jef5cQ6vX2OAGfR/I7yoLFu7Zl8IHPY92f5n/4I8M4RxZqzGWK9EfhAVd3errN02LySBHgp8IUhYhWw7Mj5evS5/88Qa0vg2+39rwN79Blrps/UOe//M8UaZP+fJdac9/9ZYg28/3fMyNrrEf8vjqy9HmVbDd1sr22r+2N7Dczh88q2ev7a6lnizbm9tq2+x5zb6lWtQN8YuLZneglDFMLjkGQh8BSaI+nDxFnQdn1ZCny9qgaN91GaD7u7h8mnRwGnJzk3yeIh4jwauAk4Mk13vsOTPHAE+e1Fnx9406mq64APA9cAN9D8HvDpA4a7GHh2kocmWZvmCOemg+bWeli1v0/c/t1oyHjj8Frg1GECJHl/kmuBVwB/N2SsFwPXVdUFw8TpsX/bremIfrptzeKxwLOSfD/JmUm2HkFuzwJurKqfDBHjAOBD7ev/YeBvhoh1MfDi9v5LGGD/n/KZOtT+P6rP5+XEmvP+PzXWKPf/eTQR7fUI22oYbXttW92/cbTVYHs91zi21XMzyrYahmyvbavntu+vagV6ppnXmbMLSR4EfAk4YMqRlTmrqruq6sk0R3e2SfKEAfJ5IbC0qs4dJpcp/rSqngrsBPxVkmcPGGd1mu41h1TVU4DbaLrBDCzJmjQfLv8xRIwH0xz52xx4JPDAJK8cJFZVXUbThebrwNdounjeOetGK7kk76Z5jscME6eq3l1Vm7Zx9h8in7WBdzO6IucQ4DHAk2m+FB48RKzVgQcDTwfeAXyxPao+jL0Z4ktv643AW9vX/620Z84G9Fqaz4lzabqG3TGXjUf5mboiYg2y/08Xa1T7/zybiPZ6FG11m8+o22vb6j5NYlsN3WqvbasHMsq2GoZor22r577vr2oF+hLue0RnEwbv0jRSSdageeOOqaoTRhW37Up2BrDjAJv/KfDiJFfTdC/cLsnnhszn+vbvUuBEmm6Mg1gCLOk523A8zZeAYewEnFdVNw4R48+Aq6rqpqr6A3AC8H8HDVZVn66qp1bVs2m6Ew1ztBTgxiSPAGj/9tXVakVI8mrghcArqmpUX8Q/T5/domfwGJovcBe0/webAOclefggwarqxvYL+d3AvzP4/g/N
/8AJbTfZH9CcNetrUJzptN08/xw4boicAF5Ns99D8wV64OdYVZdX1Quq6mk0X0Z+2u+2M3ymDrT/j/LzeaZYg+z/feQ17P4/nyaqvR6yrYYRt9e21XMzhrYabK/nwrZ67kbWVsPg7bVtNTDAvr+qFehnA1sk2bw9ArsXcNI857TsWpJPA5dV1b+MIN6GaUcWTLIWTUN0+VzjVNXfVNUmVbWQ5rX6ZlUNdIS5zeWBSdZZdp9msIWBRtWtqp8D1ybZsp21PXDpoLm1RnFE8hrg6UnWbt/X7WmuNxlIko3av5vRfCAPm99JNB/KtH//c8h4I5FkR+BdwIur6ndDxtqiZ/LFDLDvL1NVF1XVRlW1sP0/WEIz2MfPB8ztET2TuzPg/t/6MrBdG/exNIMv3TxEvD8DLq+qJUPEgKaIek57fzuG+KLas/+vBrwHOLTP7Wb6TJ3z/j/Kz+eZYg2y/88Sa2T7/zxb5dvrUbXVMNr22rZ67sbQVoPtdd9sqwcysrYaBmuvbavvMfd9v+Y4gmHXbzTXBv2Y5sjOu4eM9QWari9/oPkweN2AcZ5J03XvQuD89rbzEHk9EfhhG+9i+hzlcTkxn8vwo8I+mqbr1wXAJSN4/Z8MnNM+zy8DDx4i1trAL4D1RvBaHdT+o10MHE07cueAsb5D82XmAmD7YfdP4KHAN2g+iL8BPGSIWLu3928HbgROGyLWFTTXmy7b//sdyXW6WF9qX/sLgZNpBuMY+DWbsvxq+h8ZdrrcjgYuanM7CXjEELHWBD7XPtfzgO2GeY7AZ4A3jGAfeyZwbrvPfh942hCx3kLzef1j4ANA+ow17WfqIPv/LLHmvP/PEmvO+/8ssQbe/7t2Y0Tt9fL+r+cYa2TtNWNoq9u4z2WI9hrb6kFiDdxWt9vbXs/h82p5/9PYVveT10Bt9Szx5txeY1s9cFudNrgkSZIkSZpHq1oXd0mSJEmSVkoW6JIkSZIkdYAFuiRJkiRJHWCBLkmSJElSB1igS5IkSZLUARbo0iomycIkw/ym5yhyeHOSy5Ics5z1zkiyaEXlJUlSF9hWS5rJ6vOdgKSVQ5LVq+rOPlf/S2CnqrpqnDlJkqR72VZLKz/PoEurpgVJ/j3JJUlOT7IWQJInJzkryYVJTkzy4Hb+PUfHk2yQ5Or2/muS/EeSk4HTpz5Ikr9OcnF7O6CddyjwaOCkJG+dsv5aSY5tH/84YK2eZYckOafN+aB23vZJTuxZ5/lJThjpKyVJ0vywrZZ0Pxbo0qppC+CTVfV44NfAHu38o4B3VdUTgYuAv+8j1jOAV1fVdr0zkzwN2BfYFng68BdJnlJVbwCuB55XVR+ZEuuNwO/ax38/8LSeZe+uqkXAE4HnJHki8E3gcUk2bNfZFziyj5wlSeo622pJ92OBLq2arqqq89v75wILk6wHrF9VZ7bzPws8u49YX6+qX04z/5nAiVV1W1X9FjgBeNZyYj0b+BxAVV0IXNiz7KVJzgN+CDwe2KqqCjgaeGWS9Wm+gJzaR86SJHWdbbWk+/EadGnVdHvP/bvo6Z42gzu594DdA6Ysu22GbTJAXgB1v0DJ5sDbga2r6ldJPtOTx5HAycDvgf+Yw7V1kiR1mW21pPvxDLo0IarqFuBXSZYdOd8HWHaE/mru7cK2Z58hvw3slmTtJA8Edge+08c2rwBI8gSaLnIA69J8ubglycOAnXryvp6mG957gM/0mZskSSsd22pJnkGXJsurgUOTrA1cSXOdGMCHgS8m2YfmWrLlqqrz2qPnP2hnHV5VP1zOZocARya5EDh/2bZVdUGSHwKXtHn9z5TtjgE2rKpL+8lNkqSVmG21NMHSXDYiSd2V5BPAD6vq0/OdiyRJuj/bamk0LNAldVqSc2m61D2/qm5f3vqSJGnFsq2WRscCXZIkSZKkDnCQOEmSJEmSOsACXZIkSZKkDrBAlyRJkiSpAyzQJUmSJEnqAAt0SZIkSZI64P8HU0HGXpFNPv0AAAAA
SUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 1008x360 with 2 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(14, 5))\n",
    "\n",
    "plt.subplot(121)\n",
    "sns.barplot(x=df_per_hour.pickup_hour.values, y=df_per_hour.tip_amount.values)\n",
    "plt.title('Mean tip amount')\n",
    "plt.xlabel('hour of day')\n",
    "plt.ylabel('mean tip amount')\n",
    "\n",
    "plt.subplot(122)\n",
    "sns.barplot(x=df_per_hour.pickup_hour.values, y=df_per_hour.tip_amount_weekend.values)\n",
    "plt.title('Mean tip amount (weekend only)')\n",
    "plt.xlabel('hour of day')\n",
    "\n",
    "plt.tight_layout()\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Join"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:15:16.589065Z",
     "start_time": "2020-06-10T17:15:08.236459Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<table>\n",
       "<thead>\n",
       "<tr><th>#                                        </th><th>vendor_id  </th><th>pickup_datetime              </th><th>dropoff_datetime             </th><th>passenger_count  </th><th>payment_type  </th><th>trip_distance     </th><th>pickup_longitude  </th><th>pickup_latitude   </th><th>rate_code  </th><th>store_and_fwd_flag  </th><th>dropoff_longitude  </th><th>dropoff_latitude  </th><th>fare_amount       </th><th>surcharge  </th><th>mta_tax  </th><th>tip_amount        </th><th>tolls_amount  </th><th>total_amount      </th><th>tip_percentage     </th><th>pickup_hour  </th><th>pickup_day_of_week  </th><th>pickup_is_weekend  </th><th>right_pickup_hour  </th><th>right_tip_amount  </th><th>tip_amount_weekend  </th></tr>\n",
       "</thead>\n",
       "<tbody>\n",
       "<tr><td><i style='opacity: 0.6'>0</i>            </td><td>VTS        </td><td>2009-01-04 02:52:00.000000000</td><td>2009-01-04 03:02:00.000000000</td><td>1                </td><td>CASH          </td><td>2.630000114440918 </td><td>-73.99195861816406</td><td>40.72156524658203 </td><td>nan        </td><td>nan                 </td><td>-73.99380493164062 </td><td>40.6959228515625  </td><td>8.899999618530273 </td><td>0.5        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>9.399999618530273 </td><td>0.0                </td><td>2            </td><td>6                   </td><td>1                  </td><td>2                  </td><td>1.0271794254275997</td><td>1.0636084270722341  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1</i>            </td><td>VTS        </td><td>2009-01-04 03:31:00.000000000</td><td>2009-01-04 03:38:00.000000000</td><td>3                </td><td>Credit        </td><td>4.550000190734863 </td><td>-73.98210144042969</td><td>40.736289978027344</td><td>nan        </td><td>nan                 </td><td>-73.95584869384766 </td><td>40.768028259277344</td><td>12.100000381469727</td><td>0.5        </td><td>nan      </td><td>2.0               </td><td>0.0           </td><td>14.600000381469727</td><td>0.13698630034923553</td><td>3            </td><td>6                   </td><td>1                  </td><td>3                  </td><td>1.0004258190973754</td><td>1.050149848503694   </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>2</i>            </td><td>DDS        </td><td>2009-01-01 20:52:58.000000000</td><td>2009-01-01 21:14:00.000000000</td><td>1                </td><td>CREDIT        </td><td>5.0               </td><td>-73.9742660522461 </td><td>40.79095458984375 </td><td>nan        </td><td>nan                 </td><td>-73.9965591430664  </td><td>40.731849670410156</td><td>14.899999618530273</td><td>0.5        </td><td>nan      </td><td>3.049999952316284 </td><td>0.0           </td><td>18.450000762939453</td><td>0.16531164944171906</td><td>20           </td><td>3                   </td><td>0                  </td><td>20                 </td><td>1.0202804182400866</td><td>0.911085101443365   </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>3</i>            </td><td>DDS        </td><td>2009-01-24 16:18:23.000000000</td><td>2009-01-24 16:24:56.000000000</td><td>1                </td><td>CASH          </td><td>0.4000000059604645</td><td>-74.00157928466797</td><td>40.719383239746094</td><td>nan        </td><td>nan                 </td><td>-74.00837707519531 </td><td>40.7203483581543  </td><td>3.700000047683716 </td><td>0.0        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>3.700000047683716 </td><td>0.0                </td><td>16           </td><td>5                   </td><td>1                  </td><td>16                 </td><td>0.8844376076193646</td><td>0.8114473797450418  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>4</i>            </td><td>DDS        </td><td>2009-01-16 22:35:59.000000000</td><td>2009-01-16 22:43:35.000000000</td><td>2                </td><td>CASH          </td><td>1.2000000476837158</td><td>-73.98980712890625</td><td>40.73500442504883 </td><td>nan        </td><td>nan                 </td><td>-73.98502349853516 </td><td>40.72449493408203 </td><td>6.099999904632568 </td><td>0.5        </td><td>nan      </td><td>0.0               </td><td>0.0           </td><td>6.599999904632568 </td><td>0.0                </td><td>22           </td><td>4                   </td><td>0                  </td><td>22                 </td><td>1.0711251054555473</td><td>0.9458610337789833  </td></tr>\n",
       "<tr><td>...                                      </td><td>...        </td><td>...                          </td><td>...                          </td><td>...              </td><td>...           </td><td>...               </td><td>...               </td><td>...               </td><td>...        </td><td>...                 </td><td>...                </td><td>...               </td><td>...               </td><td>...        </td><td>...      </td><td>...               </td><td>...           </td><td>...               </td><td>...                </td><td>...          </td><td>...                 </td><td>...                </td><td>...                </td><td>...               </td><td>...                 </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,545</i></td><td>VTS        </td><td>2015-12-31 23:59:56.000000000</td><td>2016-01-01 00:08:18.000000000</td><td>5                </td><td>1             </td><td>1.2000000476837158</td><td>-73.99381256103516</td><td>40.72087097167969 </td><td>1.0        </td><td>0.0                 </td><td>-73.98621368408203 </td><td>40.722469329833984</td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>1.7599999904632568</td><td>0.0           </td><td>10.5600004196167  </td><td>0.1666666567325592 </td><td>23           </td><td>3                   </td><td>0                  </td><td>23                 </td><td>1.077161833123374 </td><td>0.9798870401138848  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,546</i></td><td>CMT        </td><td>2015-12-31 23:59:58.000000000</td><td>2016-01-01 00:05:19.000000000</td><td>2                </td><td>2             </td><td>2.0               </td><td>-73.96527099609375</td><td>40.76028060913086 </td><td>1.0        </td><td>0.0                 </td><td>-73.93951416015625 </td><td>40.75238800048828 </td><td>7.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>8.800000190734863 </td><td>0.0                </td><td>23           </td><td>3                   </td><td>0                  </td><td>23                 </td><td>1.077161833123374 </td><td>0.9798870401138848  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,547</i></td><td>CMT        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:12:55.000000000</td><td>2                </td><td>2             </td><td>3.799999952316284 </td><td>-73.98729705810547</td><td>40.739078521728516</td><td>1.0        </td><td>0.0                 </td><td>-73.9886703491211  </td><td>40.69329833984375 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>14.800000190734863</td><td>0.0                </td><td>23           </td><td>3                   </td><td>0                  </td><td>23                 </td><td>1.077161833123374 </td><td>0.9798870401138848  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,548</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:10:26.000000000</td><td>1                </td><td>2             </td><td>1.9600000381469727</td><td>-73.99755859375   </td><td>40.72569274902344 </td><td>1.0        </td><td>0.0                 </td><td>-74.01712036132812 </td><td>40.705322265625   </td><td>8.5               </td><td>0.5        </td><td>0.5      </td><td>0.0               </td><td>0.0           </td><td>9.800000190734863 </td><td>0.0                </td><td>23           </td><td>3                   </td><td>0                  </td><td>23                 </td><td>1.077161833123374 </td><td>0.9798870401138848  </td></tr>\n",
       "<tr><td><i style='opacity: 0.6'>1,083,167,549</i></td><td>VTS        </td><td>2015-12-31 23:59:59.000000000</td><td>2016-01-01 00:21:30.000000000</td><td>1                </td><td>1             </td><td>1.059999942779541 </td><td>-73.9843978881836 </td><td>40.76725769042969 </td><td>1.0        </td><td>0.0                 </td><td>-73.99098205566406 </td><td>40.76057052612305 </td><td>13.5              </td><td>0.5        </td><td>0.5      </td><td>2.9600000381469727</td><td>0.0           </td><td>17.760000228881836</td><td>0.1666666716337204 </td><td>23           </td><td>3                   </td><td>0                  </td><td>23                 </td><td>1.077161833123374 </td><td>0.9798870401138848  </td></tr>\n",
       "</tbody>\n",
       "</table>"
      ],
      "text/plain": [
       "#              vendor_id    pickup_datetime                dropoff_datetime               passenger_count    payment_type    trip_distance       pickup_longitude    pickup_latitude     rate_code    store_and_fwd_flag    dropoff_longitude    dropoff_latitude    fare_amount         surcharge    mta_tax    tip_amount          tolls_amount    total_amount        tip_percentage       pickup_hour    pickup_day_of_week    pickup_is_weekend    right_pickup_hour    right_tip_amount    tip_amount_weekend\n",
       "0              VTS          2009-01-04 02:52:00.000000000  2009-01-04 03:02:00.000000000  1                  CASH            2.630000114440918   -73.99195861816406  40.72156524658203   nan          nan                   -73.99380493164062   40.6959228515625    8.899999618530273   0.5          nan        0.0                 0.0             9.399999618530273   0.0                  2              6                     1                    2                    1.0271794254275997  1.0636084270722341\n",
       "1              VTS          2009-01-04 03:31:00.000000000  2009-01-04 03:38:00.000000000  3                  Credit          4.550000190734863   -73.98210144042969  40.736289978027344  nan          nan                   -73.95584869384766   40.768028259277344  12.100000381469727  0.5          nan        2.0                 0.0             14.600000381469727  0.13698630034923553  3              6                     1                    3                    1.0004258190973754  1.050149848503694\n",
       "2              DDS          2009-01-01 20:52:58.000000000  2009-01-01 21:14:00.000000000  1                  CREDIT          5.0                 -73.9742660522461   40.79095458984375   nan          nan                   -73.9965591430664    40.731849670410156  14.899999618530273  0.5          nan        3.049999952316284   0.0             18.450000762939453  0.16531164944171906  20             3                     0                    20                   1.0202804182400866  0.911085101443365\n",
       "3              DDS          2009-01-24 16:18:23.000000000  2009-01-24 16:24:56.000000000  1                  CASH            0.4000000059604645  -74.00157928466797  40.719383239746094  nan          nan                   -74.00837707519531   40.7203483581543    3.700000047683716   0.0          nan        0.0                 0.0             3.700000047683716   0.0                  16             5                     1                    16                   0.8844376076193646  0.8114473797450418\n",
       "4              DDS          2009-01-16 22:35:59.000000000  2009-01-16 22:43:35.000000000  2                  CASH            1.2000000476837158  -73.98980712890625  40.73500442504883   nan          nan                   -73.98502349853516   40.72449493408203   6.099999904632568   0.5          nan        0.0                 0.0             6.599999904632568   0.0                  22             4                     0                    22                   1.0711251054555473  0.9458610337789833\n",
       "...            ...          ...                            ...                            ...                ...             ...                 ...                 ...                 ...          ...                   ...                  ...                 ...                 ...          ...        ...                 ...             ...                 ...                  ...            ...                   ...                  ...                  ...                 ...\n",
       "1,083,167,545  VTS          2015-12-31 23:59:56.000000000  2016-01-01 00:08:18.000000000  5                  1               1.2000000476837158  -73.99381256103516  40.72087097167969   1.0          0.0                   -73.98621368408203   40.722469329833984  7.5                 0.5          0.5        1.7599999904632568  0.0             10.5600004196167    0.1666666567325592   23             3                     0                    23                   1.077161833123374   0.9798870401138848\n",
       "1,083,167,546  CMT          2015-12-31 23:59:58.000000000  2016-01-01 00:05:19.000000000  2                  2               2.0                 -73.96527099609375  40.76028060913086   1.0          0.0                   -73.93951416015625   40.75238800048828   7.5                 0.5          0.5        0.0                 0.0             8.800000190734863   0.0                  23             3                     0                    23                   1.077161833123374   0.9798870401138848\n",
       "1,083,167,547  CMT          2015-12-31 23:59:59.000000000  2016-01-01 00:12:55.000000000  2                  2               3.799999952316284   -73.98729705810547  40.739078521728516  1.0          0.0                   -73.9886703491211    40.69329833984375   13.5                0.5          0.5        0.0                 0.0             14.800000190734863  0.0                  23             3                     0                    23                   1.077161833123374   0.9798870401138848\n",
       "1,083,167,548  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:10:26.000000000  1                  2               1.9600000381469727  -73.99755859375     40.72569274902344   1.0          0.0                   -74.01712036132812   40.705322265625     8.5                 0.5          0.5        0.0                 0.0             9.800000190734863   0.0                  23             3                     0                    23                   1.077161833123374   0.9798870401138848\n",
       "1,083,167,549  VTS          2015-12-31 23:59:59.000000000  2016-01-01 00:21:30.000000000  1                  1               1.059999942779541   -73.9843978881836   40.76725769042969   1.0          0.0                   -73.99098205566406   40.76057052612305   13.5                0.5          0.5        2.9600000381469727  0.0             17.760000228881836  0.1666666716337204   23             3                     0                    23                   1.077161833123374   0.9798870401138848"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = df.join(df_per_hour, on='pickup_hour', rprefix=\"right_\")\n",
    "df"
   ]
  },
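  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Conceptually, the join above attaches the per-hour aggregates in `df_per_hour` to every trip that shares the same `pickup_hour` (a left join, which is the default in `vaex`). The plain-Python sketch below mimics that lookup on three rows copied from the output above. It only illustrates the semantics and says nothing about how `vaex` implements joins; the names `per_hour`, `trips` and `joined` are hypothetical."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Per-hour aggregates, keyed by pickup_hour (values copied from the output above)\n",
    "per_hour = {\n",
    "    2: {'tip_amount': 1.0271794254275997, 'tip_amount_weekend': 1.0636084270722341},\n",
    "    20: {'tip_amount': 1.0202804182400866, 'tip_amount_weekend': 0.911085101443365},\n",
    "    23: {'tip_amount': 1.077161833123374, 'tip_amount_weekend': 0.9798870401138848},\n",
    "}\n",
    "\n",
    "# A few trips, reduced to the columns that matter here\n",
    "trips = [\n",
    "    {'vendor_id': 'VTS', 'pickup_hour': 2, 'tip_amount': 0.0},\n",
    "    {'vendor_id': 'DDS', 'pickup_hour': 20, 'tip_amount': 3.049999952316284},\n",
    "    {'vendor_id': 'CMT', 'pickup_hour': 23, 'tip_amount': 0.0},\n",
    "]\n",
    "\n",
    "joined = []\n",
    "for trip in trips:\n",
    "    # Left join: the trip is kept even if its hour has no aggregate\n",
    "    right = per_hour.get(trip['pickup_hour'], {})\n",
    "    row = dict(trip)\n",
    "    for name, value in right.items():\n",
    "        # Right-hand names that clash get the 'right_' prefix, as with rprefix above\n",
    "        key = 'right_' + name if name in row else name\n",
    "        row[key] = value\n",
    "    joined.append(row)\n",
    "\n",
    "for row in joined:\n",
    "    print(row)"
   ]
  },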
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Expensive columns\n",
    "\n",
    "Let's see how Vaex performs on a computationally expensive virtual column."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:15:43.476179Z",
     "start_time": "2020-06-10T17:15:43.468979Z"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "698.4273037383392"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "def arc_distance(theta_1, phi_1, theta_2, phi_2):\n",
    "    # Haversine distance in miles; theta = latitude, phi = longitude (both in degrees)\n",
    "    temp = (np.sin((theta_2-theta_1)/2*np.pi/180)**2\n",
    "           + np.cos(theta_1*np.pi/180)*np.cos(theta_2*np.pi/180) * np.sin((phi_2-phi_1)/2*np.pi/180)**2)\n",
    "    distance = 2 * np.arctan2(np.sqrt(temp), np.sqrt(1-temp))\n",
    "    return distance * 3958.8\n",
    "\n",
    "# distance Budapest - Utrecht [miles]\n",
    "arc_distance(47.4813602, 18.9902182, 52.0842715, 5.0124523)"
   ]
  },
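  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The function above is the haversine formula, with `theta` as latitude and `phi` as longitude (both in degrees) and 3958.8 miles as the Earth's radius. As a rough cross-check, here is a minimal scalar sketch that uses only the standard `math` module; its result should agree closely with the value computed above. The name `haversine_miles` and the coordinate tuples are introduced here for illustration and are not part of the original analysis."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import math\n",
    "\n",
    "EARTH_RADIUS_MILES = 3958.8  # same constant as in arc_distance above\n",
    "\n",
    "\n",
    "def haversine_miles(lat_1, lon_1, lat_2, lon_2):\n",
    "    # Convert degrees to radians once, instead of inline as above\n",
    "    lat_1, lon_1, lat_2, lon_2 = map(math.radians, (lat_1, lon_1, lat_2, lon_2))\n",
    "    a = (math.sin((lat_2 - lat_1) / 2) ** 2\n",
    "         + math.cos(lat_1) * math.cos(lat_2) * math.sin((lon_2 - lon_1) / 2) ** 2)\n",
    "    return 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a)) * EARTH_RADIUS_MILES\n",
    "\n",
    "\n",
    "budapest = (47.4813602, 18.9902182)  # (latitude, longitude)\n",
    "utrecht = (52.0842715, 5.0124523)\n",
    "\n",
    "d = haversine_miles(*budapest, *utrecht)\n",
    "print(f'{d:.2f} miles')"
   ]
  },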
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-03T12:02:41.332378Z",
     "start_time": "2020-06-03T12:02:41.329591Z"
    }
   },
   "source": [
    "By default, the expression is evaluated with NumPy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:16:01.358969Z",
     "start_time": "2020-06-10T17:16:01.351329Z"
    }
   },
   "outputs": [],
   "source": [
    "# Add the arc-distance in miles as a virtual column\n",
    "df['arc_distance_miles_numpy'] = arc_distance(df.pickup_latitude, df.pickup_longitude,\n",
    "                                              df.dropoff_latitude, df.dropoff_longitude)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:16:17.269794Z",
     "start_time": "2020-06-10T17:16:08.437006Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "13f7dbf95b754832ba50eabfb420b828",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.3943e+09\n"
     ]
    }
   ],
   "source": [
    "sum_numpy = df['arc_distance_miles_numpy'].sum(progress='widget')\n",
    "print(f'{sum_numpy:.5}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-03T12:03:09.567983Z",
     "start_time": "2020-06-03T12:03:09.565057Z"
    }
   },
   "source": [
    "We can accelerate this with just-in-time (JIT) compilation, here via Numba."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:16:28.770086Z",
     "start_time": "2020-06-10T17:16:27.985944Z"
    }
   },
   "outputs": [],
   "source": [
    "df['arc_distance_miles_numba'] = df.arc_distance_miles_numpy.jit_numba()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:16:32.130659Z",
     "start_time": "2020-06-10T17:16:30.480536Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "07054ee186494188aa76d98344bc3a16",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.3943e+09\n"
     ]
    }
   ],
   "source": [
    "sum_numba = df.arc_distance_miles_numba.sum(progress='widget')\n",
    "print(f'{sum_numba:.5}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Acceleration on an Nvidia GPU is also possible! This example uses an _Nvidia 2080 Super_."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 47,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:16:49.306057Z",
     "start_time": "2020-06-10T17:16:48.801878Z"
    }
   },
   "outputs": [],
   "source": [
    "df['arc_distance_miles_cuda'] = df.arc_distance_miles_numpy.jit_cuda()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {
    "ExecuteTime": {
     "end_time": "2020-06-10T17:16:54.593548Z",
     "start_time": "2020-06-10T17:16:49.667681Z"
    }
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "bfeef9f482f8494f9a9752bb8275aed0",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "HBox(children=(FloatProgress(value=0.0, max=1.0), Label(value='In progress...')))"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.3943e+09\n"
     ]
    }
   ],
   "source": [
    "sum_cuda = df.arc_distance_miles_cuda.sum(progress='widget')\n",
    "print(f'{sum_cuda:.5}')"
   ]
  },
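  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The three sums above are not timed explicitly, but the `ExecuteTime` metadata stored with each cell records when it started and finished. As a rough, single-run comparison (the stamps also include progress-widget overhead and any compilation on first use), the sketch below turns those stamps into durations. The names `stamps` and `durations` are introduced here for illustration."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from datetime import datetime\n",
    "\n",
    "FMT = '%Y-%m-%dT%H:%M:%S.%fZ'\n",
    "\n",
    "# (start, end) stamps copied from the ExecuteTime metadata of the three sum cells\n",
    "stamps = {\n",
    "    'numpy': ('2020-06-10T17:16:08.437006Z', '2020-06-10T17:16:17.269794Z'),\n",
    "    'numba': ('2020-06-10T17:16:30.480536Z', '2020-06-10T17:16:32.130659Z'),\n",
    "    'cuda': ('2020-06-10T17:16:49.667681Z', '2020-06-10T17:16:54.593548Z'),\n",
    "}\n",
    "\n",
    "# Wall-clock duration of each cell, in seconds\n",
    "durations = {\n",
    "    name: (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()\n",
    "    for name, (start, end) in stamps.items()\n",
    "}\n",
    "\n",
    "baseline = durations['numpy']\n",
    "for name, seconds in durations.items():\n",
    "    print(f'{name:>5}: {seconds:6.2f} s  ({baseline / seconds:.1f}x relative to numpy)')"
   ]
  },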
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### For a fuller picture, please check out [the tutorial on the documentation pages](https://docs.vaex.io/en/latest/tutorial.html)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
